Abstract


We show that probabilistic computable functions, i.e., those functions
outputting distributions and computed by probabilistic Turing
machines, can be characterized by a natural generalization of Church
and Kleene’s partial recursive functions. The obtained algebra, following
Leivant, can be restricted so as to capture the notion of a polytime
sampleable distribution, a key concept in average-case complexity and
cryptography.
1 Introduction


Models of computation, as introduced one after the other in the first half of
the last century, were all designed around the assumption that determinacy
is one of the key properties to be modeled: given an algorithm and an input
to it, the sequence of computation steps leading to the final result is uniquely
determined by the way the algorithm describes the state evolution. The
great majority of the introduced models are equivalent, in that the classes
of functions (on, say, natural numbers) they are able to compute are the
same [4].

The second half of the 20th century has seen the assumption above 
relaxed in many different ways. Nondeterminism, as an example, has been 
investigated as a way to abstract the behavior of certain classes of algorithms, 
this way facilitating their study without necessarily changing their expressive 
power: think about how NFAs [18] make the task of proving closure properties 
of regular languages easier. 

A relatively recent step in this direction consists in allowing algorithms’
internal state to evolve probabilistically: the next state is not functionally
determined by the current one, but is obtained from it by performing a process
having possibly many outcomes, each with a probability. Probabilistically
evolving computation (probabilistic computation for short) can be a way
to abstract over determinism, but also a way to model situations in which
algorithms have access to a source of randomness⁴. Indeed, probabilistic
models are nowadays more and more pervasive: not only are they a formidable
tool when dealing with uncertainty and incomplete information, but they
are sometimes a necessity rather than an option, as in computational
cryptography (where, e.g., public key encryption schemes cannot be secure
without being probabilistic [10]). Examples of application areas in which
probabilistic computation has proved to be useful include natural language
processing [15], robotics [22], computer vision [3], and machine learning [16].

But what does the presence of probabilistic choice give us in terms
of expressivity? Are we strictly more expressive than usual, deterministic,
computation? And what about efficiency: does probabilistic choice
permit solving computational problems more efficiently? These questions
have been among the most central in the theory of computation, in particular
in computational complexity, in the last forty years, and they have received
several different answers. We postpone a discussion of these answers to the
next section; however, we can already summarize two main points emerging
from these results. First, while probability has been proved not to offer
any computational advantage in the absence of resource constraints, it is
not known whether probabilistic classes such as BPP or ZPP are different
from P. Second, all the existing works on this subject follow an approach
that we call reductionist: probabilistic computation is studied by reducing
or comparing it to deterministic computation.

⁴ Although the physical sources of randomness algorithms have access to usually contain
correlations and biases, they are modeled as sources of perfect randomness, in which bits
are uniformly distributed and independent.

This work goes in a somewhat different direction: as already mentioned,
we want to study probabilistic computation directly, without necessarily 
reducing it to deterministic computation. In our perspective, the central 
assumption is the following: a probabilistic algorithm computes what we 
call a probabilistic function, i.e. a function from a discrete set (e.g. natural 
numbers or binary strings) to distributions over the same set. What we 
want to do is to study the set of those probabilistic functions which can be 
computed by algorithms, possibly with resource constraints. 

In the first part of this paper we provide a characterization of computable 
probabilistic functions by the natural generalization of Kleene’s partial 
recursive functions, where among the initial functions there is now a function 
corresponding to tossing a fair coin, thus modeling the access to a source 
of randomness. In the non-trivial proof of completeness for the obtained 
algebra, Kleene’s minimization operator is used in an unusual way, making 
the usual proof strategy for Kleene’s Normal Form Theorem (see, e.g., [21])
useless. We later hint at how to recover the latter by replacing minimization 
with a more powerful operator. We also mention how probabilistic recursion 
theory offers characterizations of concepts like the one of a computable 
distribution and of a computable real number. 

The second part of this paper is devoted to applying the aforementioned 
recursion-theoretical framework to polynomial-time computation. We do 
that by following Bellantoni and Cook’s and Leivant’s works [1, 13], in 
which polynomial-time deterministic computation is characterized by a
restricted form of recursion, called predicative or ramified recursion. Endowing
Leivant’s ramified recurrence with a random base function, in particular, is
shown to provide a characterization of polynomial-time computable
distributions, a key notion in average-case complexity [2].

This is a revised and extended version of a paper of the same name that appeared
in the proceedings of the 11th International Colloquium on Theoretical
Aspects of Computing [6].


Related Work. This work is rooted in the classic theory of computation,
and in particular in the definition of partial computable functions as
introduced by Church and later studied by Kleene [12]. Relevant related work
includes the many probabilistic computational models introduced so far.
Without trying to be exhaustive we can mention that, starting from the early
fifties, various forms of automata in which probabilistic choice is available
have been considered (e.g., see [17]). The introduction of probabilistic choice
into a universal model of computation, namely Turing machines, is due
to Santos [19, 20], but is (essentially) already there in an earlier work by
De Leeuw and others [7]. Some years later, Gill [8] considered probabilistic
Turing machines with bounded complexity: his work has been the starting
point of a flourishing line of research on the interplay between computational
complexity and randomness. Among the many side effects of this research one
can of course mention modern cryptography [11], in which algorithms (e.g.
encryption schemes, authentication schemes, and adversaries for them) are
almost invariably assumed to work in probabilistic polynomial time.

The second part of this work is related to the area of implicit computational
complexity, which studies machine-free characterizations of complexity
classes based on mathematical logic and programming language theory, and
which is a relatively young research area. Its birth is traditionally dated to
the beginning of the nineties, when Bellantoni and Cook [1]
and Leivant [13] independently proposed function algebras precisely
characterizing (deterministic) polynomial time computable functions. In the last
twenty years, this area has produced many interesting results, and complexity
classes spanning from the logarithmic space computable functions to the
elementary functions have been characterized by, e.g., function algebras, type
systems [14], or fragments of linear logic [9]. Recently, some investigations
on the interplay between implicit complexity and probabilistic computation
have started to appear [5]. There is however an intrinsic difficulty in giving
implicit characterizations of probabilistic classes like BPP or ZPP: these
are semantic classes defined by imposing a polynomial bound on time, but
also appropriate bounds on the probability of error. This makes the task of
enumerating machines computing problems in the classes much harder and,
ultimately, prevents one from deriving implicit characterizations of the classes
above. Again, our emphasis here is different: we do not see probabilistic
algorithms as artifacts computing functions of the same kind as those
deterministic algorithms compute, but we see probabilistic algorithms as
devices outputting distributions.


2 Probabilistic Recursion Theory

In this section we provide a characterization of the functions computed
by a probabilistic Turing machine (PTM) in terms of a function algebra à
la Kleene. We first define probabilistic recursive functions, which are the
elements of our algebra. Next we define formally the class of probabilistic
functions computed by a PTM. Finally, we show the equivalence of the two
introduced classes. In the following, $\mathbb{R}_{[0,1]}$ is the unit interval.


Since PTMs compute probability distributions, the functions that we
consider in our algebra have domain $\mathbb{N}^k$ (the set of $k$-tuples in $\mathbb{N}$) and
codomain $\mathbb{N} \to \mathbb{R}_{[0,1]}$ (rather than $\mathbb{N}$ as in the classic case). The idea is that
if $f(x)$ is a function which returns $p \in \mathbb{R}_{[0,1]}$ on input $y \in \mathbb{N}$, then $p$ is the
probability of getting $y$ as the output when feeding $f$ with the input $x$.
We note that we could extend our codomain from $\mathbb{N} \to \mathbb{R}_{[0,1]}$ to $\mathbb{N}^m \to \mathbb{R}_{[0,1]}$;
however, we use $\mathbb{N} \to \mathbb{R}_{[0,1]}$ in order to simplify the presentation, while
remaining consistent with the classic literature on recursion theory.


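Definition 1 (Pseudodistributions, Probabilistic Functions) A
pseudodistribution on $\mathbb{N}$ is a function $D : \mathbb{N} \to \mathbb{R}_{[0,1]}$ such that $\sum_{n \in \mathbb{N}} D(n) \leq 1$.
$\sum_{n \in \mathbb{N}} D(n)$ is often denoted by $\sum D$. Let $\mathbb{P}_{\mathbb{N}}$ be the set of all
pseudodistributions on $\mathbb{N}$. A probabilistic function (PF) is a function from $\mathbb{N}^k$ to
$\mathbb{P}_{\mathbb{N}}$.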
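As an informal illustration of Definition 1, a pseudodistribution with finite support can be pictured as a finite map from naturals to exact rationals. The sketch below is only a model under that assumption; the names `from_pairs` and `mass` are hypothetical helpers and do not come from the paper.

```python
from fractions import Fraction

def from_pairs(pairs):
    """Build a finite-support pseudodistribution from (value, probability) pairs.
    Repeated values have their probabilities added up."""
    dist = {}
    for n, p in pairs:
        dist[n] = dist.get(n, Fraction(0)) + Fraction(p)
    # Definition 1 requires the total mass to be at most 1.
    assert sum(dist.values()) <= 1, "total mass must not exceed 1"
    return dist

def mass(dist):
    """Total mass of D; the missing mass 1 - mass(D) is never output."""
    return sum(dist.values(), Fraction(0))

quarter = Fraction(1, 4)
d = from_pairs([(0, quarter), (3, quarter), (0, quarter)])
empty = from_pairs([])  # assigns probability 0 to every natural number
```

Here `d` assigns 1/2 to 0 and 1/4 to 3, so its mass is 3/4 and the remaining 1/4 is not assigned to any output.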


In the following we use the expression $\{n_1^{p_1}, \ldots, n_k^{p_k}\}$ to denote the
pseudodistribution $D$ defined as $D(n) = \sum_{n_i = n} p_i$. Observe that $\sum D = \sum_{i=1}^{k} p_i$.
When this does not cause ambiguity, a pseudodistribution is simply called
a distribution. Please notice that probabilistic functions are always total
functions, but their codomain is a set of distributions which do not necessarily
sum to 1, but rather to a real number less than or equal to 1, this way
modeling the probability of divergence. For example, the nowhere-defined
partial function $\Omega : \mathbb{N} \rightharpoonup \mathbb{N}$ of classic recursion theory becomes a probabilistic
function which returns the empty distribution $\emptyset$ on any input. The first step
towards defining our function algebra consists in giving a set of functions to
start from:


Definition 2 (Basic Probabilistic Functions) The basic probabilistic
functions (BPFs) are defined as follows:

• The zero function $z : \mathbb{N} \to \mathbb{P}_{\mathbb{N}}$ defined as: $z(x)(0) = 1$ for every $x \in \mathbb{N}$;

• The successor function $s : \mathbb{N} \to \mathbb{P}_{\mathbb{N}}$ defined as: $s(x)(x + 1) = 1$ for every
$x \in \mathbb{N}$;

• The projection function $\Pi^n_m : \mathbb{N}^n \to \mathbb{P}_{\mathbb{N}}$ defined as: $\Pi^n_m(x_1, \ldots, x_n)(x_m) = 1$
for every positive $n, m \in \mathbb{N}$ such that $1 \leq m \leq n$;

• The fair coin function $r : \mathbb{N} \to \mathbb{P}_{\mathbb{N}}$ that is defined as:

$$r(x)(y) = \begin{cases} 1/2 & \text{if } y = x; \\ 1/2 & \text{if } y = x + 1; \\ 0 & \text{otherwise.} \end{cases}$$


The first three BPFs are essentially the same as the basic functions from
classic recursion theory, while $r$ is the only truly probabilistic BPF: it behaves
like the identity or like the successor, each with probability $\frac{1}{2}$. It is worth
noting that, as we will show in Example 1, this definition of $r$ allows us to
obtain probabilistic choice.
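The four basic functions can be sketched directly, assuming a distribution is modelled as a dictionary from outputs to exact probabilities. The Python names `zero`, `succ`, `proj` and `coin` are illustrative stand-ins for $z$, $s$, $\Pi^n_m$ and $r$.

```python
from fractions import Fraction

ONE = Fraction(1)
HALF = Fraction(1, 2)

def zero(x):
    """z: returns 0 with probability 1."""
    return {0: ONE}

def succ(x):
    """s: returns x + 1 with probability 1."""
    return {x + 1: ONE}

def proj(n, m):
    """Projection of arity n onto the m-th argument (m is 1-based)."""
    assert 1 <= m <= n
    def pi(*xs):
        assert len(xs) == n
        return {xs[m - 1]: ONE}
    return pi

def coin(x):
    """r: behaves like the identity or like the successor, each with probability 1/2."""
    return {x: HALF, x + 1: HALF}
```

Only `coin` returns a distribution with more than one outcome; the other three are point masses, mirroring the remark that $r$ is the only truly probabilistic BPF.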

The next step consists in defining how PFs compose. Ordinary function
composition of course cannot be used here, because when composing two PFs $f$ and
$g$ the codomain of $g$ does not match the domain of $f$. Indeed, $g$ returns
a distribution $\mathbb{N} \to \mathbb{R}_{[0,1]}$ while $f$ expects a natural number as input. What
we have to do here is the following. Given an input $x \in \mathbb{N}$ and an output
$y \in \mathbb{N}$ for the composition $f \bullet g$, we apply the distribution $g(x)$ to a generic
value $z \in \mathbb{N}$. This gives a probability $g(x)(z)$ which is then multiplied by
the probability that the distribution $f(z)$ associates to the value $y \in \mathbb{N}$. If
we then consider the sum of the obtained product $g(x)(z) \cdot f(z)(y)$ on all
possible $z \in \mathbb{N}$ we obtain the probability of $f \bullet g$ returning $y$ when fed with
$x$. The sum is due to the fact that two different values, say $z_1, z_2 \in \mathbb{N}$, which
provide two different distributions $f(z_1)$ and $f(z_2)$ must both contribute to
the same probability value $f(z_1)(y) + f(z_2)(y)$ for a fixed $y$. In other words,
we are doing nothing more than lifting $f$ to a function from distributions to
distributions, then composing it with $g$. Formally:


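Definition 3 (Composition) We define the composition $f \bullet g : \mathbb{N} \to \mathbb{P}_{\mathbb{N}}$
of two functions $f : \mathbb{N} \to \mathbb{P}_{\mathbb{N}}$ and $g : \mathbb{N} \to \mathbb{P}_{\mathbb{N}}$ as:

$$((f \bullet g)(x))(y) = \sum_{z \in \mathbb{N}} g(x)(z) \cdot f(z)(y).$$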


Please note that function composition as defined above is precisely the same 
as convolution from functional analysis. The previous definition can be 
generalized to functions taking more than one parameter in the expected 
way: 


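Definition 4 (Generalized Composition) We define the generalized
composition of functions $f : \mathbb{N}^n \to \mathbb{P}_{\mathbb{N}}$, $g_1 : \mathbb{N}^k \to \mathbb{P}_{\mathbb{N}}, \ldots, g_n : \mathbb{N}^k \to \mathbb{P}_{\mathbb{N}}$ as the
function $f \odot (g_1, \ldots, g_n) : \mathbb{N}^k \to \mathbb{P}_{\mathbb{N}}$ defined as follows:

$$((f \odot (g_1, \ldots, g_n))(\mathbf{x}))(y) = \sum_{z_1, \ldots, z_n \in \mathbb{N}} f(z_1, \ldots, z_n)(y) \cdot \prod_{1 \leq i \leq n} g_i(\mathbf{x})(z_i).$$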


With a slight abuse of notation, we can treat probabilistic functions as
ordinary functions when forming expressions. Suppose, as an example,
that $x \in \mathbb{N}$ and that $f : \mathbb{N}^3 \to \mathbb{P}_{\mathbb{N}}$, $g : \mathbb{N} \to \mathbb{P}_{\mathbb{N}}$, $h : \mathbb{N} \to \mathbb{P}_{\mathbb{N}}$. Then the
expression $f(g(x), x, h(x))$ stands for the distribution in $\mathbb{P}_{\mathbb{N}}$ defined as follows:
$(f \odot (g, id, h))(x)$, where $id = \Pi^1_1$ is the identity PF.
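The sum-of-products in Definitions 3 and 4 can be sketched for finite-support distributions as follows. This is a minimal model, assuming distributions are dictionaries of exact rationals; `compose`, `coin`, `identity` and `add` are illustrative names, and `add` is a hypothetical deterministic helper.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

def coin(x):
    """The fair coin: returns x or x + 1, each with probability 1/2."""
    return {x: HALF, x + 1: HALF}

def identity(x):
    return {x: Fraction(1)}

def add(a, b):
    """Deterministic addition seen as a PF (hypothetical helper)."""
    return {a + b: Fraction(1)}

def compose(f, *gs):
    """Generalized composition: sum over (z_1..z_n) of f(z)(y) * prod_i g_i(x)(z_i)."""
    def composed(*xs):
        out = {}
        inner = [g(*xs).items() for g in gs]
        for choice in product(*inner):        # one (z_i, probability) pair per g_i
            weight = Fraction(1)
            for _, p in choice:
                weight *= p
            zs = [z for z, _ in choice]
            for y, q in f(*zs).items():
                out[y] = out.get(y, Fraction(0)) + q * weight
        return out
    return composed

two_flips = compose(coin, coin)          # the coin composed with itself
mixed = compose(add, coin, identity)     # the expression add(coin(x), x)
```

In `two_flips(0)` the outcome 1 is reached through two different intermediate values, which is exactly why the definition sums over all $z$.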

The way we have defined probabilistic functions and their composition 
is reminiscent of, and indeed inspired by, the way one defines the Kleisli 
category for the Giry monad, starting from the category of partial functions 
on sets. This categorical way of seeing the problem can help a lot in finding 
the right definition, but by itself is not adequate to proving the existence of 
a correspondence with machines like the one we want to give here. 

Primitive recursion is defined as in Kleene’s algebra, provided one uses 
composition as previously defined: 


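Definition 5 (Primitive Recursion) Given functions $g : \mathbb{N}^{k+2} \to \mathbb{P}_{\mathbb{N}}$
and $f : \mathbb{N}^k \to \mathbb{P}_{\mathbb{N}}$, the function $h : \mathbb{N}^{k+1} \to \mathbb{P}_{\mathbb{N}}$ defined as

$$h(\mathbf{x}, 0) = f(\mathbf{x}); \qquad h(\mathbf{x}, y + 1) = g(y, \mathbf{x}, h(\mathbf{x}, y));$$

is said to be defined by primitive recursion from $f$ and $g$, and is denoted as
$rec(f, g)$.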
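Primitive recursion over distributions can be unfolded iteratively: the distribution of $h(\mathbf{x}, y)$ is pushed through $g$ one step at a time, summing over its outcomes as composition prescribes. The sketch below assumes finite-support dictionaries; `rec`, `coin` and `flips` are illustrative names.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def coin(x):
    """The fair coin: returns x or x + 1, each with probability 1/2."""
    return {x: HALF, x + 1: HALF}

def rec(f, g):
    """h = rec(f, g): h(x, 0) = f(x) and h(x, y + 1) = g(y, x, h(x, y))."""
    def h(*args):
        *xs, y = args
        dist = f(*xs)                         # distribution of h(x, 0)
        for i in range(y):
            nxt = {}
            for v, p in dist.items():         # v ranges over outcomes of h(x, i)
                for w, q in g(i, *xs, v).items():
                    nxt[w] = nxt.get(w, Fraction(0)) + p * q
            dist = nxt
        return dist
    return h

# flips(x, y): start from x and apply the fair coin y times.
flips = rec(lambda x: {x: Fraction(1)}, lambda i, x, acc: coin(acc))
```

With this choice of base and step functions, `flips(0, 3)` is the binomial distribution of three fair flips.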


We now turn our attention to the minimization operator which, as in
the deterministic case, is needed in order to obtain the full expressive power
of (P)TMs. The definition of this operator is in our case delicate and requires
some explanation. Recall that, in the classic case, given a partial function
$f : \mathbb{N}^{k+1} \rightharpoonup \mathbb{N}$, the minimization operator allows one to define another partial
function, call it $\mu f$, which computes from $\mathbf{x} \in \mathbb{N}^k$ the least value of $y$ such
that $f(\mathbf{x}, y)$ is equal to 0 (and $f(\mathbf{x}, z)$ is defined and different from 0 for
all $z < y$), if such a value exists (and is undefined otherwise). In our case,
again, we are concerned with distributions, hence we cannot simply consider
the least value on which $f$ returns 0, since functions return 0 with a certain
probability. The idea is then to define the minimization $\mu f$ as a function
which, given an input $\mathbf{x} \in \mathbb{N}^k$, returns a distribution associating to each
natural $y$ the probability that the result of $f(\mathbf{x}, y)$ is 0 and the result of
$f(\mathbf{x}, z)$ is strictly positive for every $z < y$. Formally:
Definition 6 (Minimization) Given a PF $f : \mathbb{N}^{k+1} \to \mathbb{P}_{\mathbb{N}}$, we define
another PF $\mu f : \mathbb{N}^k \to \mathbb{P}_{\mathbb{N}}$ as follows:

$$(\mu f)(\mathbf{x})(y) = f(\mathbf{x}, y)(0) \cdot \prod_{z < y} \Big( \sum_{k > 0} f(\mathbf{x}, z)(k) \Big).$$
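The formula of Definition 6 can be evaluated incrementally by carrying the running product over $z < y$. Since the resulting distribution may have infinite support, the sketch below truncates it at a caller-supplied `bound`; that truncation, and the names `mu`, `gap`, `flip` and `never_zero`, are assumptions made for illustration only.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def mu(f, bound):
    """Truncated minimization: computes (mu f)(x)(y) for y < bound only."""
    def h(*xs):
        out = {}
        survive = Fraction(1)   # product over z < y of P[f(x, z) > 0]
        for y in range(bound):
            d = f(*xs, y)
            p = d.get(0, Fraction(0)) * survive
            if p:
                out[y] = p
            survive *= sum((q for k, q in d.items() if k > 0), Fraction(0))
        return out
    return h

def gap(x, y):
    """Deterministic PF returning |x - y| with probability 1."""
    return {abs(x - y): Fraction(1)}

def flip(x, y):
    """Returns 0 or 1 with probability 1/2 each, ignoring its inputs."""
    return {0: HALF, 1: HALF}

def never_zero(x, y):
    """Never returns 0, so minimization diverges with probability 1."""
    return {1: Fraction(1)}
```

On the deterministic `gap`, minimization behaves as in the classic case and returns the least zero; on `never_zero` it yields the empty distribution; on `flip` it yields a geometric distribution, the first `bound` values of which are returned.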


We are finally able to define the class of functions we are interested in as 
follows. 


Definition 7 (Probabilistic Recursive Functions) The class $\mathcal{PR}$ of
probabilistic recursive functions is the smallest class of probabilistic functions
that contains the BPFs (Definition 2) and is closed under the operations of
generalized composition (Definition 4), primitive recursion (Definition 5) and
minimization (Definition 6).


Example 1 The following are examples of probabilistic recursive functions:

• The identity function $id : \mathbb{N} \to \mathbb{P}_{\mathbb{N}}$ is defined as follows: for all $x, y \in \mathbb{N}$,

$$id(x)(y) = \begin{cases} 1 & \text{if } y = x; \\ 0 & \text{otherwise.} \end{cases}$$

This definition means that $id = \Pi^1_1$. Since the latter is a BPF (Definition
2), $id$ is in $\mathcal{PR}$.

• The probabilistic function $rand : \mathbb{N} \to \mathbb{P}_{\mathbb{N}}$ such that, for every $x \in \mathbb{N}$,

$$rand(x)(y) = \begin{cases} 1/2 & \text{if } y = 0; \\ 1/2 & \text{if } y = 1; \\ 0 & \text{otherwise;} \end{cases}$$

can be easily shown to be recursive, since $rand = r \odot z$ (and we know that
both $r$ and $z$ are BPFs). Actually, one can easily see that $rand$ could itself
be taken as the only genuinely probabilistic BPF, i.e., $r$ can be constructed
from $rand$ and the other BPFs by composition and primitive recursion.

All functions we have proved recursive so far have the property that the
returned distribution is finite and total for any input. Indeed, this is true
for every probabilistic primitive recursive function, since minimization
is the only way to break finiteness and totality. Consider the function
$f : \mathbb{N} \to \mathbb{P}_{\mathbb{N}}$ defined as

$$f(x)(y) = \begin{cases} \frac{1}{2^{y-x+1}} & \text{if } y \geq x; \\ 0 & \text{otherwise.} \end{cases}$$


Probabilistic Recursion Theory 
and Implicit Computational Complexity 185 


We define another function $h : \mathbb{N} \to \mathbb{P}_{\mathbb{N}}$ by stipulating that

$$h(x)(y) = \frac{1}{2^{y+1}}$$

for every $x, y \in \mathbb{N}$. $h$ is a probabilistic recursive function; indeed, consider
the function $k : \mathbb{N}^2 \to \mathbb{P}_{\mathbb{N}}$ defined as $rand \odot \Pi^2_1$ and build $\mu k$. By definition,

$$(\mu k)(x)(y) = k(x, y)(0) \cdot \prod_{z < y} \sum_{q > 0} k(x, z)(q). \qquad (1)$$

Then observe that $(\mu k)(x)(y) = \frac{1}{2^{y+1}}$: by (1), $(\mu k)(x)(y)$ unfolds into
a product of exactly $y + 1$ copies of $\frac{1}{2}$, each “coming from the flip of a
distinct coin”. Hence, $h = \mu k$. Then we observe that

$$(add \odot (\mu k, id))(x)(y) = \sum_{z_1, z_2} add(z_1, z_2)(y) \cdot \big( (\mu k)(x)(z_1) \cdot id(x)(z_2) \big).$$

But notice that $id(x)(z_2) = 1$ only when $z_2 = x$ (and in the other cases
$id(x)(z_2) = 0$), $(\mu k)(x)(z_1) = \frac{1}{2^{z_1+1}}$, and $add(z_1, z_2)(y) = 1$ only when
$z_1 + z_2 = y$ (and in the other cases, $add(z_1, z_2)(y) = 0$). This implies that
the term in the sum is different from 0 only when $z_2 = x$ and $z_1 + z_2 = y$,
namely when $z_1 = y - z_2 = y - x$, and in that case its value is $\frac{1}{2^{y-x+1}}$.

Thus, we can claim that $f = (add \odot (\mu k, id))$, and that $f$ is in $\mathcal{PR}$.
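The identity $f = add \odot (\mu k, id)$ can be checked numerically on a finite prefix of the support. The sketch below is only a sanity check under stated assumptions: minimization is truncated at a hypothetical `BOUND`, distributions are dictionaries of exact rationals, and `mu_k` and `f` are illustrative Python names for $\mu k$ and the composed function.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
BOUND = 12  # truncation depth for minimization (an assumption of this sketch)

def rand(x):
    """Returns 0 or 1 with probability 1/2 each."""
    return {0: HALF, 1: HALF}

def k(x, y):
    """rand composed with a projection: the second argument is ignored."""
    return rand(x)

def mu_k(x):
    """(mu k)(x)(y) for y < BOUND, following equation (1)."""
    out, survive = {}, Fraction(1)
    for y in range(BOUND):
        d = k(x, y)
        out[y] = d.get(0, Fraction(0)) * survive
        survive *= sum((q for v, q in d.items() if v > 0), Fraction(0))
    return out

def f(x):
    """add composed with (mu k, id): only z2 = x and z1 = y - x contribute."""
    out = {}
    for z1, p in mu_k(x).items():
        z2 = x                                  # id(x) is concentrated on x
        out[z1 + z2] = out.get(z1 + z2, Fraction(0)) + p
    return out
```

For every computed outcome $y$ the value agrees with $\frac{1}{2^{y-x+1}}$, and no outcome below $x$ receives any probability.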


It is easy to show that $\mathcal{PR}$ includes all partial recursive functions, seen as
probabilistic functions. This can be done by first defining extended recursive
functions as follows.


Definition 8 (Extended Recursive Functions) For every partial recursive
function $f : \mathbb{N}^k \rightharpoonup \mathbb{N}$ we define the extended function $p_f : \mathbb{N}^k \to \mathbb{P}_{\mathbb{N}}$ as
follows:

$$p_f(\mathbf{x})(y) = \begin{cases} 1 & \text{if } y = f(\mathbf{x}); \\ 0 & \text{otherwise.} \end{cases}$$
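Definition 8 turns a partial function into a PF whose values are point masses or the empty distribution. A rough sketch follows, assuming undefinedness is modelled by returning `None` (a real partial recursive function would simply not terminate); `extend` and `halve` are hypothetical names.

```python
from fractions import Fraction

def extend(f):
    """p_f: point mass on f(x) when f(x) is defined, empty distribution otherwise."""
    def p_f(*xs):
        y = f(*xs)
        return {} if y is None else {y: Fraction(1)}
    return p_f

def halve(x):
    """A partial function in this model: defined only on even numbers."""
    return x // 2 if x % 2 == 0 else None

p_halve = extend(halve)
```

Where `halve` is undefined, `p_halve` returns a distribution of total mass 0, matching the way divergence is represented by pseudodistributions.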


Proposition 1 If $f$ is a partial recursive function, then $p_f$ as defined above
is in $\mathcal{PR}$.


Proof. The proof goes by induction on the structure of f as a partial 
recursive function. The proof for the base cases is immediate. As for the 
inductive cases, we have the following ones: 
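• $f$ is defined by composition from $h, g_1, \ldots, g_n$ as:

$$f(\mathbf{x}) = h(g_1(\mathbf{x}), \ldots, g_n(\mathbf{x})),$$

where $h : \mathbb{N}^n \rightharpoonup \mathbb{N}$ and $g_i : \mathbb{N}^k \rightharpoonup \mathbb{N}$ for every $1 \leq i \leq n$ are partial
recursive functions. By definition, we have

$$p_f(\mathbf{x})(y) = \begin{cases} 1 & \text{if } y = h(g_1(\mathbf{x}), \ldots, g_n(\mathbf{x})); \\ 0 & \text{otherwise.} \end{cases}$$

Also by definition, we have that for every $1 \leq i \leq n$, it holds that:

$$p_{g_i}(\mathbf{x})(y) = \begin{cases} 1 & \text{if } y = g_i(\mathbf{x}); \\ 0 & \text{otherwise;} \end{cases} \qquad p_h(\mathbf{x})(y) = \begin{cases} 1 & \text{if } y = h(\mathbf{x}); \\ 0 & \text{otherwise.} \end{cases}$$

By induction hypothesis, we observe that $p_{g_1}, \ldots, p_{g_n}, p_h \in \mathcal{PR}$ and

$$((p_h \odot (p_{g_1}, \ldots, p_{g_n}))(\mathbf{x}))(y) = \sum_{z_1, \ldots, z_n \in \mathbb{N}} p_h(z_1, \ldots, z_n)(y) \cdot \prod_{1 \leq i \leq n} p_{g_i}(\mathbf{x})(z_i) = p_f(\mathbf{x})(y)$$

by using the definitions above. Thus $p_f = (p_h \odot (p_{g_1}, \ldots, p_{g_n})) \in \mathcal{PR}$.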
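• $f$ is defined by primitive recursion, so $f : \mathbb{N}^k \times \mathbb{N} \rightharpoonup \mathbb{N}$ is defined as:

$$f(\mathbf{x}, 0) = h(\mathbf{x}); \qquad f(\mathbf{x}, y + 1) = g(y, \mathbf{x}, f(\mathbf{x}, y));$$

where $g : \mathbb{N}^{k+2} \rightharpoonup \mathbb{N}$ and $h : \mathbb{N}^k \rightharpoonup \mathbb{N}$ are partial recursive functions.
By induction hypothesis, we have that $p_g, p_h \in \mathcal{PR}$. The fact that
$rec(p_h, p_g)(\mathbf{x}, y) = p_f(\mathbf{x}, y)$ can be proved by an induction on $y$. Thus $p_f$
is in $\mathcal{PR}$ because $p_f = rec(p_h, p_g)$.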

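• Suppose $f : \mathbb{N}^k \rightharpoonup \mathbb{N}$ is defined by minimization, i.e. $f = \mu g$. By
definition of $f(\mathbf{x})$ we have:

$$p_f(\mathbf{x})(z) = \begin{cases} 1 & \text{if } z = \mu y.(g(\mathbf{x}, y) = 0); \\ 0 & \text{otherwise.} \end{cases}$$

By hypothesis $p_g \in \mathcal{PR}$. We observe that:

$$\begin{aligned}
(\mu p_g)(\mathbf{x})(z) &= p_g(\mathbf{x}, z)(0) \cdot \prod_{n < z} \sum_{k > 0} p_g(\mathbf{x}, n)(k) \\
&= p_g(\mathbf{x}, z)(0) \cdot \prod_{n < z} \sum_{k = g(\mathbf{x}, n) > 0} 1 \\
&= \begin{cases} 1 & \text{if } g(\mathbf{x}, z) = 0 \text{ and } \forall n < z.\, g(\mathbf{x}, n) > 0; \\ 0 & \text{otherwise;} \end{cases} \\
&= \begin{cases} 1 & \text{if } z = \mu y.(g(\mathbf{x}, y) = 0); \\ 0 & \text{otherwise;} \end{cases} \\
&= p_f(\mathbf{x})(z).
\end{aligned}$$

Thus $p_f$ is in $\mathcal{PR}$ because $p_f = \mu p_g$.
This concludes the proof.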


2.1 Probabilistic Turing Machines and Computable Functions


In this section we introduce computable probabilistic functions as those
probabilistic functions which can be computed by probabilistic Turing
machines. As previously mentioned, probabilistic computation models
received wide interest in computer science as early as the fifties [7] and early
sixties [17]. A natural question which arose was then to see what happened
if random elements were allowed in a Turing machine. This question led
to several formalizations of probabilistic Turing machines [7, 19] (which,
essentially, are Turing machines which have the ability to flip coins in order
to make uniform, fair, decisions) and to several results concerning the
computational complexity of problems when solved by PTMs [8].

Following [8], a probabilistic Turing machine (PTM) can be seen as a
Turing machine with two transition functions $\delta_0, \delta_1$. At each computation
step, either $\delta_0$ or $\delta_1$ can be applied, each with probability $\frac{1}{2}$. Then, in a way
analogous to the deterministic case, we can define a notion of (initial, final)
configuration for a PTM. In the following, $\Sigma_b$ denotes the set of possible
symbols on the tape, including a blank symbol $\Box$; $Q$ denotes the set of states;
$Q_f \subseteq Q$ denotes the set of final states and $q_s \in Q$ denotes the initial state.


Definition 9 (Probabilistic Turing Machine) A probabilistic Turing
machine (PTM) is a Turing machine endowed with two transition functions
$\delta_0, \delta_1$. At each computation step the transition function $\delta_0$ can be applied
with probability $\frac{1}{2}$ and the transition function $\delta_1$ can be applied with probability $\frac{1}{2}$.


Definition 10 (Configuration of a PTM) Let $M$ be a PTM. We define
a PTM configuration as a 4-tuple $(s, a, t, q) \in \Sigma_b^* \times \Sigma_b \times \Sigma_b^* \times Q$ such
that:

• The first component, $s \in \Sigma_b^*$, is the portion of the tape lying on the left
of the head.

• The second component, $a \in \Sigma_b$, is the symbol the head is reading.

• The third component, $t \in \Sigma_b^*$, is the portion of the tape lying on the right
of the head.

• The fourth component, $q \in Q$, is the current state.

Moreover we define the set of all configurations as $\mathcal{C}_M = \Sigma_b^* \times \Sigma_b \times \Sigma_b^* \times Q$.


Definition 11 (Initial and Final Configurations of a PTM) Let $M$ be
a PTM. We define the initial configuration of $M$ for the string $s$ as the
configuration $(\varepsilon, a, v, q_s) \in \Sigma_b^* \times \Sigma_b \times \Sigma_b^* \times Q$ such that $s = a \cdot v$ and the
fourth component, $q_s \in Q$, is the initial state. We denote it with $\mathcal{IN}_M^s$.
Similarly, we define a final configuration of $M$ for $s$ as a configuration
$(s, \Box, \varepsilon, q_f) \in \Sigma_b^* \times \Sigma_b \times \Sigma_b^* \times Q_f$. The set of all such final configurations for
$M$ is denoted by $\mathcal{FC}_M^s$.


For a function $T : \mathbb{N} \to \mathbb{N}$, we say that a PTM $M$ runs in time bounded by $T$
if for any input $x$, $M$ halts on input $x$ within $T(|x|)$ steps independently of
the random choices it makes. Thus, $M$ works in polynomial time if it runs
in time bounded by $P$, for some polynomial $P$.

Intuitively, the function computed by a PTM $M$ associates to each
input $s \in \Sigma^*$ a pseudodistribution which indicates the probability of reaching
a final configuration of $M$ from $\mathcal{IN}_M^s$. It is worth noticing that, differently
from the deterministic case, since in a PTM the same configuration can
be obtained along different computation paths, the probability of reaching
a given final configuration is the sum of the probabilities of reaching the
configuration along all computation paths, of which there can be (even
infinitely) many. It is thus convenient to define the function computed by
a PTM through a fixpoint construction, as follows. First, we define
string distributions.


Definition 12 A string (pseudo)distribution on $\Sigma^*$ is a function $D : \Sigma^* \to \mathbb{R}_{[0,1]}$
such that $\sum_{s \in \Sigma^*} D(s) \leq 1$. $\mathbb{P}_{\Sigma^*}$ denotes the set of all string
(pseudo)distributions on $\Sigma^*$.

Next we can define a partial order on string distributions by a pointwise
extension of the usual order on $\mathbb{R}$:


Definition 13 The relation $\sqsubseteq_{\mathbb{P}_{\Sigma^*}} \subseteq \mathbb{P}_{\Sigma^*} \times \mathbb{P}_{\Sigma^*}$ is defined by stipulating that
$A \sqsubseteq_{\mathbb{P}_{\Sigma^*}} B$ if and only if, for all $s \in \Sigma^*$, $A(s) \leq B(s)$.


The proof of the following is immediate. 


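Proposition 2 The structure $(\mathbb{P}_{\Sigma^*}, \sqsubseteq_{\mathbb{P}_{\Sigma^*}})$ is a poset.

It is time to define the domain $\mathcal{CEV}$:

Definition 14 The set $\mathcal{CEV}$ is defined as $\{f \mid f : \mathcal{C}_M \to \mathbb{P}_{\Sigma^*}\}$.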


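The set $\mathcal{CEV}$ will be used as the domain⁵ of the functional whose least
fixpoint gives the function computed by a PTM. To this aim, inheriting the
structure on $\mathbb{P}_{\Sigma^*}$, we can define a partial order on $\mathcal{CEV}$ as follows.


Definition 15 The relation $\sqsubseteq_{\mathcal{CEV}} \subseteq \mathcal{CEV} \times \mathcal{CEV}$ is defined for $A, B \in \mathcal{CEV}$
as $A \sqsubseteq_{\mathcal{CEV}} B$ if and only if, for all $c \in \mathcal{C}_M$, $A(c) \sqsubseteq_{\mathbb{P}_{\Sigma^*}} B(c)$.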


The proof of the following is also immediate. 


Proposition 3 The structure $(\mathcal{CEV}, \sqsubseteq_{\mathcal{CEV}})$ is a poset.


Given a poset, the notions of least upper bound, denoted by $\bigsqcup$, and of an
ascending chain are defined as usual. Next, the bottom elements of the
posets of our interest can be obtained as follows.


Lemma 1 Let $d_\perp : \Sigma^* \to \mathbb{R}_{[0,1]}$ be defined by stipulating that $d_\perp(s) = 0$ for
all $s \in \Sigma^*$. Then, $d_\perp$ is the bottom element of the poset $(\mathbb{P}_{\Sigma^*}, \sqsubseteq_{\mathbb{P}_{\Sigma^*}})$.


Lemma 2 Let $b_\perp : \mathcal{C}_M \to \mathbb{P}_{\Sigma^*}$ be defined by stipulating that $b_\perp(c) = d_\perp$ for
all $c \in \mathcal{C}_M$. Then, $b_\perp$ is the bottom element of the poset $(\mathcal{CEV}, \sqsubseteq_{\mathcal{CEV}})$.


Now, it is time to prove that the posets at hand are also $\omega$-complete, that is,
that each ascending chain in the poset has a least upper bound in it.


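Proposition 4 The poset $(\mathbb{P}_{\Sigma^*}, \sqsubseteq_{\mathbb{P}_{\Sigma^*}})$ is an $\omega$CPO.


Proof. We need to prove that for each chain $c_1 \sqsubseteq_{\mathbb{P}_{\Sigma^*}} c_2 \sqsubseteq_{\mathbb{P}_{\Sigma^*}} c_3 \ldots$ the
least upper bound $\bigsqcup_i c_i$ exists. First note that since $\sum_{s \in \Sigma^*} c_i(s) \leq 1$, from the
definition of $\sqsubseteq_{\mathbb{P}_{\Sigma^*}}$ it follows that, for each $s \in \Sigma^*$, $c_1(s) \leq c_2(s) \leq \ldots \leq 1$
holds. This implies that, for each $s \in \Sigma^*$, the limit $\lim_{i \to \infty} c_i(s)$ exists.
Hence, defining $c_\sqcup$ as the function such that $c_\sqcup(s) = \lim_{i \to \infty} c_i(s)$, we
have that $c_\sqcup = \bigsqcup_i c_i$. Indeed, $c_\sqcup$ is itself a pseudodistribution (each of its
finite partial sums is a limit of partial sums of the $c_i$, all bounded by 1),
$c_\sqcup \sqsupseteq_{\mathbb{P}_{\Sigma^*}} c_i$ for every $i$, and any upper bound of the family
$\{c_i\}_{i \in \mathbb{N}}$ is clearly greater than or equal to $c_\sqcup$.

⁵ Of course $\mathcal{CEV}$ is a proper superset of the functions computed by PTMs.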


Proposition 5 The poset $(\mathcal{CEV}, \sqsubseteq_{\mathcal{CEV}})$ is an $\omega$CPO.


Proof. Analogous to the previous one. 


We can now define a functional $F_M$ on $\mathcal{CEV}$ which will be used to define
the function computed by $M$ via a fixpoint construction. Intuitively, the
application of the functional $F_M$ describes one computation step. Formally:


Definition 16 Given a PTM $M$, we define a functional $F_M : \mathcal{CEV} \to \mathcal{CEV}$
as:

$$F_M(f)(C) = \begin{cases} \{s^1\} & \text{if } C \in \mathcal{FC}_M^s; \\ \frac{1}{2} f(\delta_0(C)) + \frac{1}{2} f(\delta_1(C)) & \text{otherwise.} \end{cases}$$

Note that, according to the notation introduced after Definition 1, $\{s^1\}$
is the distribution which assigns probability 1 to $s$. The following proposition
is needed in order to apply the usual fixpoint result.


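Proposition 6 The functional $F_M$ is continuous.


Proof. We prove that $F_M(\bigsqcup_{i \in \mathbb{N}} f_i) = \bigsqcup_{i \in \mathbb{N}} F_M(f_i)$, or, in other words,
that for every configuration $C$, $F_M(\bigsqcup_{i \in \mathbb{N}} f_i)(C) = (\bigsqcup_{i \in \mathbb{N}} F_M(f_i))(C)$. Now,
notice that for every $C$,

$$F_M\Big(\bigsqcup_{i \in \mathbb{N}} f_i\Big)(C) = \begin{cases} \{s^1\} & \text{if } C \in \mathcal{FC}_M^s; \\ \frac{1}{2} \big(\bigsqcup_{i \in \mathbb{N}} f_i\big)(C_1) + \frac{1}{2} \big(\bigsqcup_{i \in \mathbb{N}} f_i\big)(C_2) & \text{if } C \to C_1, C_2; \end{cases}$$

where $C \to C_1, C_2$ means that, from the configuration $C$, the machine with
one computation step evolves to the configurations $C_1$ and $C_2$ and, similarly,
that:

$$\Big(\bigsqcup_{i \in \mathbb{N}} F_M(f_i)\Big)(C) = \begin{cases} \{s^1\} & \text{if } C \in \mathcal{FC}_M^s; \\ \bigsqcup_{i \in \mathbb{N}} \big( \frac{1}{2} f_i(C_1) + \frac{1}{2} f_i(C_2) \big) & \text{if } C \to C_1, C_2. \end{cases}$$

Now, given any $C$, we distinguish two cases:

• If $C \in \mathcal{FC}_M^s$, then

$$F_M\Big(\bigsqcup_{i \in \mathbb{N}} f_i\Big)(C) = \{s^1\} = \bigsqcup_{i \in \mathbb{N}} \{s^1\} = \Big(\bigsqcup_{i \in \mathbb{N}} F_M(f_i)\Big)(C).$$

• If $C \to C_1, C_2$, then

$$\begin{aligned}
F_M\Big(\bigsqcup_{i \in \mathbb{N}} f_i\Big)(C) &= \frac{1}{2} \Big(\bigsqcup_{i \in \mathbb{N}} f_i\Big)(C_1) + \frac{1}{2} \Big(\bigsqcup_{i \in \mathbb{N}} f_i\Big)(C_2) \\
&= \frac{1}{2} \bigsqcup_{i \in \mathbb{N}} f_i(C_1) + \frac{1}{2} \bigsqcup_{i \in \mathbb{N}} f_i(C_2) \\
&= \bigsqcup_{i \in \mathbb{N}} \frac{1}{2} f_i(C_1) + \bigsqcup_{i \in \mathbb{N}} \frac{1}{2} f_i(C_2) = \bigsqcup_{i \in \mathbb{N}} \Big( \frac{1}{2} f_i(C_1) + \frac{1}{2} f_i(C_2) \Big) \\
&= \Big(\bigsqcup_{i \in \mathbb{N}} F_M(f_i)\Big)(C).
\end{aligned}$$

This concludes the proof.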


Theorem 1 The functional $F_M$ from Definition 16 has a least fixpoint which
is equal to $\bigsqcup_{n \geq 0} F_M^n(b_\perp)$.


Proof. Immediate from the well-known fixpoint theorem for continuous 
maps on a wCPO. 


Such a least fixpoint is, once composed with a function returning $\mathcal{IN}_M^s$
from $s$, the function computed by the machine $M$, which is denoted as
$\mathcal{IO}_M : \Sigma^* \to \mathbb{P}_{\Sigma^*}$. A probabilistic function is computable if it is the function
computed by some PTM $M$. The set of computable functions is $\mathcal{PC}$.
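The fixpoint construction can be made concrete by iterating the one-step functional from the bottom element. The sketch below does this for a hypothetical toy machine whose configurations are plain labels rather than tape contents; the tables `FINAL`, `DELTA0`, `DELTA1` and the functions `step` and `iterate` are assumptions made for illustration and are not part of the paper.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Hypothetical toy machine: configurations are labels, not real tapes.
FINAL = {"B": "0", "E": "1"}                  # final configuration -> output string
DELTA0 = {"S": "A", "A": "B", "D": "D"}       # first transition function
DELTA1 = {"S": "E", "A": "A", "D": "D"}       # second transition function
CONFIGS = ["S", "A", "B", "D", "E"]

def half_sum(d1, d2):
    """The pseudodistribution 1/2 * d1 + 1/2 * d2."""
    out = {}
    for d in (d1, d2):
        for s, p in d.items():
            out[s] = out.get(s, Fraction(0)) + HALF * p
    return out

def step(f):
    """One application of the functional to f (a map from configurations to distributions)."""
    return {c: {FINAL[c]: Fraction(1)} if c in FINAL
               else half_sum(f[DELTA0[c]], f[DELTA1[c]])
            for c in CONFIGS}

def iterate(n):
    """Apply the functional n times to the bottom element (all-empty distributions)."""
    f = {c: {} for c in CONFIGS}
    for _ in range(n):
        f = step(f)
    return f
```

Starting from configuration `A`, the machine halts with output "0" after a geometric number of steps, so the iterates approach probability 1 from below; configuration `D` loops forever and keeps the empty distribution at every iteration.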

The fixpoint construction delineated above is an appropriate way to
define what a PTM computes, although working with it can be quite
cumbersome in proofs. A better, equivalent, definition consists in working with
computation trees, each of which represents all probabilistic computation
paths of a machine $M$ when fed with a given input string $x$. We define such
a tree as follows. Each node is labelled by a configuration of the machine
and each edge represents a computation step. The root is labelled with
the initial configuration $\mathcal{IN}_M^x$ and each node labelled with $C$ has either
no child (if $C$ is final) or 2 children (otherwise) labelled with $\delta_0(C)$ and
$\delta_1(C)$. Please notice that the same configuration may be duplicated across
a single level of the tree as well as appear at different levels of the tree;
nevertheless we represent each such appearance by a separate node. In
Figure 1(a), an example computation tree is depicted such that $C$ is the
initial configuration, $E$ and $G$ are final configurations, while $D$ and $F$ are
neither initial nor final. The same computation tree can be represented more
abstractly as in Figure 1(b), or even as in Figure 1(c), where we focus on
one among the nodes labelled with $E$.

[Figure 1: Computation Trees, some Examples. Panels (a), (b), (c).]


We can naturally associate a real number to each node of a computation
tree, corresponding to the probability that the node is reached in the
computation: it is $\frac{1}{2^n}$, where $n$ is the height of the node (its distance from
the root). The probability of
a particular final configuration is the sum of the probabilities of all leaves
labelled with that configuration. We also enumerate nodes in the tree,
top-down and from left to right, by using binary strings in the following
way: the root is associated with the empty string $\varepsilon$. Then, if $b$ is the binary string
representing the node $N$, the left child of $N$ is associated with the string $b \cdot 0$
while the right child is associated with the string $b \cdot 1$. Note that from this definition it
follows that each binary string associated to a node $N$ indicates a path in
the tree from the root to $N$. The computation tree whose root is labelled
with $\mathcal{IN}_M^x$ will be denoted as $CT_M(x)$.
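The tree-based reading can be sketched by enumerating nodes through their binary addresses and summing $\frac{1}{2^n}$ over the final nodes found. The machine below is a hypothetical toy example with label-valued configurations, and the exploration is cut off at a chosen `depth`; both are assumptions of this sketch, as are the names `leaves` and `output_distribution`.

```python
from fractions import Fraction

# Hypothetical toy machine: configurations are labels, not real tapes.
FINAL = {"B": "0", "E": "1"}                 # final configuration -> output string
DELTA0 = {"S": "A", "A": "B"}
DELTA1 = {"S": "E", "A": "A"}

def leaves(initial, depth):
    """Yield (address, output) for every final node at distance <= depth from the root."""
    stack = [(initial, "")]                  # the root has the empty address
    while stack:
        config, addr = stack.pop()
        if config in FINAL:
            yield addr, FINAL[config]
        elif len(addr) < depth:
            stack.append((DELTA0[config], addr + "0"))   # left child: b followed by 0
            stack.append((DELTA1[config], addr + "1"))   # right child: b followed by 1
    # non-final nodes at the cut-off depth are simply dropped

def output_distribution(initial, depth):
    """Sum 1/2^n over the final nodes, grouped by output string."""
    out = {}
    for addr, s in leaves(initial, depth):
        out[s] = out.get(s, Fraction(0)) + Fraction(1, 2 ** len(addr))
    return out
```

Each address spells out the sequence of left and right choices leading to the node, so its length is exactly the exponent $n$ used for the node's probability.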


It is worth noticing that the notion of computable probabilistic function
we have described subsumes other key notions in probabilistic and
real-number computation. As an example, computable distributions can be
characterized as those distributions on $\Sigma^*$ which can be obtained as the
result of a function in $\mathcal{PC}$ on a fixed input. Analogously, computable real
numbers from the unit interval $[0, 1]$ can be seen as those elements of $\mathbb{R}$ in
the form $f(0)(0)$ for a computable function $f \in \mathcal{PC}$.
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2.2 Equivalence 


In this section we prove that probabilistic recursive functions are the same as
probabilistic computable functions, modulo an appropriate bijection between
strings and natural numbers, which we denote (as its inverse) with $\overline{(\cdot)}$.


2.2.1 Soundness 


In order to prove the equivalence result we first need to show that any
probabilistic recursive function can be computed by a PTM. This result
is not difficult and, analogously to the deterministic case, is proved by
exhibiting PTMs which simulate the basic probabilistic recursive functions
and by showing that $\mathscr{PC}$ is closed under composition, primitive recursion,
and minimization. This is done by the following lemmata.


Lemma 3 (Basic Functions and Computability) All basic probabilistic
functions are computable.


Proof. For every basic function from Definition 2, we can construct a
probabilistic Turing machine that computes it. More specifically, the proof
is immediate for the BPFs z, s and $\Pi^n_m$: they are deterministic, thus we
can use the usual Turing machines (seen as PTMs) for them. As for the
function r, it can be simulated by a PTM M which writes 1 or 0 on the tape,
both with probability $\frac{1}{2}$, and then halts.


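The composition of two computable probabilistic functions is itself
computable:


Lemma 4 (Composition and Computability) Given computable
$f : \mathbb{N}^n \to \mathbb{P}_\mathbb{N}$ and $g_1 : \mathbb{N}^k \to \mathbb{P}_\mathbb{N}, \ldots, g_n : \mathbb{N}^k \to \mathbb{P}_\mathbb{N}$, the function
$f \odot (g_1, \ldots, g_n) : \mathbb{N}^k \to \mathbb{P}_\mathbb{N}$ is itself computable.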


Proof. We give the proof for the case when n = 1, i.e., the case in which
the function to be proved computable is $f \odot g : \mathbb{N}^k \to \mathbb{P}_\mathbb{N}$ (the general
case is analogous, if only a bit more tedious). By hypothesis, f and g are
both computable, and thus there are PTMs which compute them, say N and M
respectively. We define a PTM L, working on 2 tapes*, which computes
$f \odot g$. On the first tape L simulates M by computing the value of g on the
input, while on the second tape L simulates N by computing the result of f
on the result of g. A bit more in detail, the machine L operates as follows
on an input x:

1. it first computes g(x) by simulating M on the first tape;

2. it then copies the content y of the first tape to the second tape;

3. it finally computes the function f(y) by simulating N, obtaining z.

Figure 2: Computation Tree for Composition

But does L correctly compute $f \odot g$? One can observe that the computation
tree $CT_L(x)$ has a structure like the one in Figure 2, where $T_M$ has the
same structure as $CT_M(x)$, C corresponds to a final configuration for M
and y, and $T_N$ has the same structure as $CT_N(y)$. Now, for every D which
is final for N and z, the probability of reaching the corresponding
configuration in $CT_L(x)$ is clearly

$$\sum_{y} g(x)(y) \cdot f(y)(z) = ((f \odot g)(x))(z),$$

which is the thesis.

*The equivalence of multi-tape PTMs with single-tape PTMs can be proved in a
way which is analogous to the classic case.
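
As an informal illustration of the sum over intermediate results used in this proof, the following Python sketch represents a finitely supported distribution as a dictionary from outputs to exact probabilities. The names `compose` and `succ_or_stay` are hypothetical and only serve the example, and distributions with infinite support are outside its scope.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def compose(f, g):
    """Return x -> distribution of f(y), where y is drawn from g(x)."""
    def h(x):
        out = {}
        for y, py in g(x).items():        # intermediate result of g on x
            for z, pz in f(y).items():    # final result of f on y
                out[z] = out.get(z, Fraction(0)) + py * pz
        return out
    return h

def succ_or_stay(x):
    # Hypothetical random function: x or x + 1, each with probability 1/2.
    return {x: HALF, x + 1: HALF}

twice = compose(succ_or_stay, succ_or_stay)
print(twice(0))
```

Each output z accumulates the product of the two probabilities over every intermediate value y, which mirrors the two-phase structure of the machine L.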


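Lemma 5 (Primitive Recursion and Computability) Given computable
functions $g : \mathbb{N}^{k+2} \to \mathbb{P}_\mathbb{N}$ and $f : \mathbb{N}^k \to \mathbb{P}_\mathbb{N}$, the function
$rec(f,g) : \mathbb{N}^{k+1} \to \mathbb{P}_\mathbb{N}$ is itself computable.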


Proof. By hypothesis, f and g are both computable, and thus there are
PTMs which compute them, say M and N respectively. We define a PTM L
computing rec(f,g) and working on 5 tapes. The first tape is the input tape,
on the second tape L keeps track of a counter, on the third tape L computes
g, on the fourth tape L computes f, and on the last tape it saves the result.
The machine L operates as follows:

1. it copies to the second tape the value 0 and then it copies to the fourth
tape (an encoding of) the first k elements of the input;

2. it computes f by simulating M on the fourth tape and saves the result
on the last tape;

3. it checks whether the second tape contains the (k+1)-th element of the
input (which is on the first tape). In this case L stops and the last tape
contains the result; otherwise it copies the first k elements of the input
from the first tape to the third tape, then it copies the value on the
second tape to the third tape, and then it copies the result present on the
last tape to the third tape;

4. L increments the value on the second tape;

5. it computes g on the third tape and saves the result on the last tape;

6. it goes back to Step 3 above.

Figure 3: Computation Tree for Primitive Recursion

One can observe that the computation tree $CT_L(x, n)$ has a structure like
the one in Figure 3, where:

• $T_M$ has the same structure as $CT_M(x)$;

• $C_0$ corresponds to a final configuration for M and x;

• for every $0 \le i < n$, $T_N^{i}$ has the same structure as $CT_N(i, x, y_i)$;

• for every $0 \le i < n$, $C_{i+1}$ corresponds to a final configuration for N
and $y_i$.

Now, for every D which is final for N and z, the probability of reaching such
a configuration in $CT_L(x, n)$ can be proved to be equal to
(rec(f, g))(x, n)(z), by induction on n.
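
The loop performed by L can be mimicked on finitely supported distributions. The Python sketch below is only illustrative: the argument order `g(x, i, y)` and the names `rec`, `base`, `step` and `walk` are assumptions made for the example, not notation fixed by the text.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def rec(f, g):
    """Probabilistic primitive recursion on finitely supported distributions."""
    def h(x, n):
        dist = f(x)                          # base case: h(x, 0) = f(x)
        for i in range(n):                   # one round per counter value
            nxt = {}
            for y, py in dist.items():       # previous result y
                for z, pz in g(x, i, y).items():
                    nxt[z] = nxt.get(z, Fraction(0)) + py * pz
            dist = nxt
        return dist
    return h

base = lambda x: {x: Fraction(1)}              # deterministic starting value
step = lambda x, i, y: {y: HALF, y + 1: HALF}  # add a fair coin at each round
walk = rec(base, step)
print(walk(0, 3))
```

With these hypothetical `base` and `step`, `walk(x, n)` is x plus the number of heads in n fair coin flips.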


Lemma 6 (Minimization and Computability) Given a computable function
$f : \mathbb{N}^{k+1} \to \mathbb{P}_\mathbb{N}$, the function $\mu f : \mathbb{N}^k \to \mathbb{P}_\mathbb{N}$ is itself computable.


Proof. We proceed as in Lemma 4 and Lemma 5. Since f is computable,
there is a PTM M which computes it. We define a PTM N which works
with 4 tapes, and which computes $\mu f$. The first tape is the input tape, on
the second tape N saves a counter that corresponds to the y (in the
definition of minimization) and which is incremented iteratively, on the
third tape it computes the function f, and on the last tape it saves the
result. The machine N operates as follows:

1. it writes 0 to the second tape;

2. it copies to the third tape the input and the value of the second tape as
a tuple;

3. it computes on the third tape the function f and saves the result on the
last tape;

4. it checks whether the last tape contains the value 0; in this case it saves
on the last tape the element in the second tape and it stops, otherwise it
increases the value of the second tape by 1;

5. it returns to Step 2 above.

Figure 4: Computation Tree for Minimization

One can observe that the computation tree $CT_N(x)$ has a structure like the
one in Figure 4, where for every $y \ge 0$:

• $T_y$ has the same structure as $CT_M(x, y)$;

• $C_y^k$ corresponds to a final configuration for M and k, and is actually
linked to the initial configuration of $T_{y+1}$ only if k is different from 0.

By the usual kind of reasoning, we can prove that the probability of reaching
a given final configuration in $CT_N(x)$ is precisely the one given by the
definition of $\mu f$.
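
A minimal Python sketch of the quantity computed by this machine follows, assuming that the probability of returning y is the probability that f yields 0 at y times the probability that it yields a positive value at every smaller index. The names `mu`, `coin` and `first_zero` are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def mu(f):
    """Return (x, y) -> probability that y is the first index where f yields 0."""
    def h(x, y):
        p = f(x, y).get(0, Fraction(0))      # f(x, y) returns 0 ...
        for z in range(y):                   # ... after positive results at all z < y
            p *= sum(q for k, q in f(x, z).items() if k > 0)
        return p
    return h

coin = lambda x, y: {0: HALF, 1: HALF}       # hypothetical f: a fair coin
first_zero = mu(coin)
print([first_zero(0, y) for y in range(4)])
```

For the fair coin the result is a geometric distribution: the loop of steps 2 to 5 may run arbitrarily long, yet it halts with probability 1.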


We can finally prove the following theorem, showing that all probabilistic 
recursive functions are computable: 


Theorem 2 $\mathscr{PR} \subseteq \mathscr{PC}$.


Proof. The fact that $f \in \mathscr{PR}$ implies $f \in \mathscr{PC}$ can be proved by induction
on the structure of the proof that $f \in \mathscr{PR}$, where Lemmas 3, 4, 5 and 6
each handle an inductive case.


2.2.2 Completeness 


The most difficult part of the equivalence proof consists in proving that
each probabilistic computable function is actually recursive. Analogously to
the classic case, a good strategy consists in representing configurations as
natural numbers, then encoding the transition function of the machine at
hand, call it M, as a (recursive) function on $\mathbb{N}$. In the classic case the proof
proceeds by making essential use of the minimization operator to determine
the number of transition steps of M necessary to reach a final configuration
(if such a number exists). This number can then be fed to another function
which simulates M (on an input) for a given number of steps, and which is
primitive recursive. In our case, this strategy does not work: the number of
computation steps can be infinite, even when the convergence probability is
1.

Before entering into the technicalities, we need some preliminary
definitions. First we need to encode the rational numbers $\mathbb{Q}$ into $\mathbb{N}$. Let
$pair : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ be any recursive bijection between pairs of natural
numbers and natural numbers such that pair and its inverse are both
computable. Let then enc be just $p_{pair}$, i.e. the function
$enc : \mathbb{N} \times \mathbb{N} \to \mathbb{P}_\mathbb{N}$ is defined as follows:

$$enc(a, b)(q) = \begin{cases} 1 & \text{if } q = pair(a,b); \\ 0 & \text{otherwise.} \end{cases}$$

The function enc allows us to represent positive rational numbers as pairs
of natural numbers in the obvious way and is probabilistic recursive. It is
now time to define a few notions on computation trees.


Definition 17 (Computation Tree and String Probability) The function
$PT_M : \mathbb{N} \times \mathbb{N} \to \mathbb{Q}$ is defined by stipulating that $PT_M(x,y)$ is the
probability of observing the string $\overline{y}$ in the tree $CT_M(x)$, namely
$\frac{1}{2^{|\overline{y}|}}$.

Of course, $PT_M$ is partial recursive, thus $p_{PT_M}$ is probabilistic recursive.
Since the same configuration C can label more than one node in a
computation tree $CT_M(x)$, $PT_M$ does not indicate the probability of reaching
C, even when C is the label of the node corresponding to the second argument.
Such a probability can be obtained by summing the probability of all nodes
labelled with the configuration at hand:


Definition 18 (Configuration Probability) Suppose given a PTM M.
If $x \in \mathbb{N}$ and $z \in \mathcal{C}_M$, the subset $CC_M(x, z)$ of $\mathbb{N}$ contains precisely the
indices of nodes of $CT_M(x)$ which are labelled by z. The function
$PC_M : \mathbb{N} \times \mathbb{N} \to \mathbb{Q}$ is defined as follows:
$PC_M(x, z) = \sum_{y \in CC_M(x,z)} PT_M(x, y)$.


Contrary to $PT_M$, there is nothing guaranteeing that $PC_M$ is indeed
computable. In the following, indeed, we will not prove completeness through
a proof of computability for $PC_M$, but rather through a long detour.

Please recall the example computation tree $CT_M(x)$ for a hypothetical
PTM M and an input x as in Figure 1(a). As can be easily checked,
$PC_M(x, C) = 1$, while $PC_M(x, E) = \frac{3}{4}$. Indeed, notice that there are three
nodes in the tree which are labelled with E, namely those corresponding to
the binary strings 00, 01, and 10.
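
The two values above can be recomputed mechanically. The Python sketch below assumes a labelling of the tree that is consistent with the description of Figure 1(a) (root C, internal nodes D and F, leaves E, E, E, G); since the figure itself is not reproduced here, the exact placement of D, F and G is an assumption, and the names `labels`, `pt` and `pc` are hypothetical.

```python
from fractions import Fraction

# Assumed labelling: nodes are binary strings, "" is the root.
labels = {"": "C", "0": "D", "1": "F",
          "00": "E", "01": "E", "10": "E", "11": "G"}

def pt(node):
    """Probability of reaching a node: 1 / 2^(length of its binary string)."""
    return Fraction(1, 2 ** len(node))

def pc(config):
    """Sum the probabilities of all nodes labelled with the given configuration."""
    return sum(pt(n) for n, c in labels.items() if c == config)

print(pc("C"), pc("E"))
```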

As we already mentioned, our proof separates the classic part of the 
computation performed by the underlying PTM, which essentially computes 
the configurations reached by the machine in different paths, from the 
probabilistic part, which instead computes the probability values associated 
to each computation by using minimization. These two tasks are realized by 
two suitable probabilistic recursive functions, which are then composed to 
obtain the function computed by the underlying PTM. We start with the 
probabilistic part, which is more complicated. 

We need to define a function which returns the conditional probability
of terminating at the node corresponding to the string $\overline{y}$ in the tree
$CT_M(x)$, given that all the nodes $\overline{z}$ where z < y are labelled with non-final
configurations. This is captured by the following definition:


Definition 19 Given a PTM M, we define $PT^0_M : \mathbb{N} \times \mathbb{N} \to \mathbb{Q}$ and
$PT^1_M : \mathbb{N} \times \mathbb{N} \to \mathbb{Q}$ as follows:

$$PT^1_M(x,y) = \begin{cases} 1 & \text{if } y \text{ is not a leaf of } CT_M(x); \\ 1 - PT^0_M(x,y) & \text{otherwise;} \end{cases}$$

$$PT^0_M(x,y) = \begin{cases} 0 & \text{if } y \text{ is not a leaf of } CT_M(x); \\ \dfrac{PT_M(x,y)}{\prod_{k<y} PT^1_M(x,k)} & \text{otherwise.} \end{cases}$$

Note that, according to the previous definition, $PT^1_M(x, y)$ is the
probability of not terminating the computation in the node y, while
$PT^0_M(x, y)$ represents the probability of terminating the computation in the
node y, both knowing that the computation has not terminated in any node k
preceding y.


Proposition 7 The functions $PT^0_M : \mathbb{N} \times \mathbb{N} \to \mathbb{Q}$ and $PT^1_M : \mathbb{N} \times \mathbb{N} \to \mathbb{Q}$
are partial recursive.


Proof. Please observe that $PT_M$ is partial recursive and that the definitions
above are mutually recursive, but the underlying order is well-founded. Both
functions are thus intuitively computable, and hence partial recursive by the
Church-Turing thesis.

The reason why the two functions above are useful is that they associate
the distribution $\{0^{PT^0_M(x,y)}, 1^{PT^1_M(x,y)}\}$ to each pair of natural numbers
(x, y). In Figure 5, we give the quantities we have just defined for the tree
from Figure 1(a). Each internal node is associated with the same distribution
$\{0^0, 1^1\}$. Only the leaves are associated with nontrivial distributions. As
an example, the distribution associated to the node 01 is
$\{0^{1/3}, 1^{2/3}\}$, because we have that

$$PT^0_M(x, 01) = \frac{PT_M(x, 01)}{\prod_{k < 01} PT^1_M(x, k)}
= \frac{1}{4 \cdot PT^1_M(x, 00) \cdot PT^1_M(x, 1) \cdot PT^1_M(x, 0) \cdot PT^1_M(x, \varepsilon)}
= \frac{1}{4 \cdot PT^1_M(x, 00)}.$$

As can be easily verified, $PT^1_M(x, 00) = \frac{3}{4}$. Thus, $PT^0_M(x, 01) = \frac{1}{3}$.


Figure 5: The Conditional Probabilities for the Computation Tree from
Figure 1(a)
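
The conditional probabilities of Definition 19 can be computed node by node in enumeration order. The Python sketch below assumes that the tree of Figure 1(a) is a complete binary tree of height 2 whose four deepest nodes are the leaves; this shape, and the names `nodes`, `leaves` and `conditional`, are assumptions made for illustration.

```python
from fractions import Fraction

nodes = ["", "0", "1", "00", "01", "10", "11"]   # top-down, left-to-right
leaves = {"00", "01", "10", "11"}

def pt(node):
    return Fraction(1, 2 ** len(node))

def conditional(nodes, leaves):
    """Return dictionaries (pt0, pt1) following Definition 19."""
    pt0, pt1 = {}, {}
    survive = Fraction(1)          # product of pt1 over all earlier nodes
    for y in nodes:
        pt0[y] = pt(y) / survive if y in leaves else Fraction(0)
        pt1[y] = 1 - pt0[y]
        survive *= pt1[y]
    return pt0, pt1

pt0, pt1 = conditional(nodes, leaves)
print(pt0["01"], pt1["01"])
```

The running product `survive` plays the role of the denominator in the definition, so each leaf only needs the values already computed for the nodes that precede it.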


We now need to go further, and prove that the probabilistic function
returning, on input (x, y), the distribution $\{0^{PT^0_M(x,y)}, 1^{PT^1_M(x,y)}\}$ is
recursive. This is captured by the following definition:

Definition 20 Given a PTM M, the function $PTC_M : \mathbb{N} \times \mathbb{N} \to \mathbb{P}_\mathbb{N}$ is
defined as follows

$$PTC_M(x,y)(z) = \begin{cases} PT^0_M(x,y) & \text{if } z = 0; \\ PT^1_M(x,y) & \text{if } z = 1; \\ 0 & \text{otherwise.} \end{cases}$$

The function $PTC_M$ is really the core of our encoding. On the one hand, we
will show that it is indeed recursive. On the other, minimizing it is going to
provide us exactly with the function we need to reach our final goal, namely
proving that the probabilistic function computed by M is itself recursive.
But how should we proceed if we want to prove $PTC_M$ to be recursive?
The idea is to compose $p_{PT^1_M}$ with a function that turns its input into the
probability of returning 1. This is precisely what the following function does:


Definition 21 The function $I2P : \mathbb{Q} \to \mathbb{P}_\mathbb{N}$ is defined as follows

$$I2P(x)(y) = \begin{cases} x & \text{if } (0 \le x \le 1) \wedge (y = 1); \\ 1 - x & \text{if } (0 \le x \le 1) \wedge (y = 0); \\ 0 & \text{otherwise.} \end{cases}$$

Please observe that the input to I2P ranges over the rational numbers, as
usual encoded by pairs of natural numbers. Previous definitions allow us to
treat (rational numbers representing) probabilities in our algebra of
functions. Indeed:


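Proposition 8 The probabilistic function I2P is recursive.


Proof. We first observe that $h : \mathbb{N} \to \mathbb{P}_\mathbb{N}$ defined as $h(x)(y) = \frac{1}{2^{y+1}}$ is
a probabilistic recursive function, because $h = \mu(rand \odot \Pi^2_1)$. Next we
observe that every $q \in \mathbb{Q} \cap [0,1]$ can be represented in binary notation as
$q = \sum_{i \in \mathbb{N}} \frac{c^q_i}{2^{i+1}}$, where $c^q_i \in \{0,1\}$ (i.e., $c^q_i$ is the i-th element of the
binary representation of q). Moreover, a function computing such a $c^q_i$ from
q and i is partial recursive. Hence we can define $b : \mathbb{N} \times \mathbb{N} \to \mathbb{P}_\mathbb{N}$ as follows

$$b(q, i)(y) = \begin{cases} 1 & \text{if } y = c^q_i; \\ 0 & \text{otherwise;} \end{cases}$$

and conclude that b is indeed a probabilistic recursive function (because
$\mathscr{PR}$ includes all the partial recursive functions, seen as probabilistic
functions). Observe that:

$$b(q, i)(y) = \begin{cases} c^q_i & \text{if } y = 1; \\ 1 - c^q_i & \text{if } y = 0; \\ 0 & \text{otherwise.} \end{cases}$$

From the definition of composition, it follows that

$$\begin{aligned}
(b \odot (id, h))(q)(y) &= \sum_{x_1, x_2} b(x_1, x_2)(y) \cdot id(q)(x_1) \cdot h(q)(x_2) \\
&= \sum_{x_2} b(q, x_2)(y) \cdot h(q)(x_2) \\
&= \sum_{x_2} b(q, x_2)(y) \cdot \frac{1}{2^{x_2+1}} \\
&= \begin{cases} \sum_{i} \frac{c^q_i}{2^{i+1}} & \text{if } y = 1; \\ \sum_{i} \frac{1 - c^q_i}{2^{i+1}} & \text{if } y = 0; \\ 0 & \text{otherwise} \end{cases} \\
&= \begin{cases} q & \text{if } y = 1; \\ 1 - q & \text{if } y = 0; \\ 0 & \text{otherwise.} \end{cases}
\end{aligned}$$

This shows that $I2P = b \odot (id, h)$, and hence that I2P is probabilistic
recursive.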
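
The argument amounts to drawing a digit position i with probability $1/2^{i+1}$ and returning the i-th binary digit of q. The Python sketch below checks this numerically on truncated expansions; it assumes q lies in [0, 1) so that the digits can be read off by integer arithmetic, and the names `bit` and `prob_one` are hypothetical.

```python
from fractions import Fraction

def bit(q, i):
    """i-th binary digit c_i of q in [0, 1), so that q = sum_i c_i / 2^(i+1)."""
    return int(q * 2 ** (i + 1)) % 2

def prob_one(q, n):
    """Probability of output 1 when only digit positions i < n are considered."""
    return sum(Fraction(bit(q, i), 2 ** (i + 1)) for i in range(n))

print(prob_one(Fraction(3, 8), 3))    # dyadic rational: exact after 3 digits
print(prob_one(Fraction(1, 3), 20))   # non-dyadic: approximation from below
```

For a dyadic rational the truncated sum already equals q, while for other rationals the missing mass is smaller than $1/2^n$.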


The following is an easy corollary of what we have obtained so far:


Proposition 9 The probabilistic function $PTC_M$ is recursive.


Proof. Just observe that $PTC_M = I2P \odot p_{PT^1_M}$.

The probabilistic recursive function obtained as the minimization of $PTC_M$
allows us to compute a probabilistic function that, given x, returns y with
probability $PT_M(x, y)$ if y is a leaf (and otherwise the probability is just 0).


Definition 22 The function $CF_M : \mathbb{N} \to \mathbb{P}_\mathbb{N}$ is defined as follows

$$CF_M(x)(y) = \begin{cases} PT_M(x,y) & \text{if } y \text{ corresponds to a leaf;} \\ 0 & \text{otherwise.} \end{cases}$$


Proposition 10 The probabilistic function $CF_M$ is recursive.


Proof. The probabilistic function $CF_M$ is just the function obtained by
minimizing $PTC_M$, which we already know to be recursive. Indeed, if y
corresponds to a leaf, then:

$$\begin{aligned}
(\mu PTC_M)(x)(y) &= PTC_M(x,y)(0) \cdot \prod_{z<y} \sum_{k>0} PTC_M(x,z)(k) \\
&= PTC_M(x,y)(0) \cdot \prod_{z<y} PTC_M(x,z)(1) \\
&= PT^0_M(x,y) \cdot \prod_{z<y} PT^1_M(x,z) \\
&= \frac{PT_M(x,y)}{\prod_{z<y} PT^1_M(x,z)} \cdot \prod_{z<y} PT^1_M(x,z) = PT_M(x,y).
\end{aligned}$$

If, however, y does not correspond to a leaf, then:

$$\begin{aligned}
(\mu PTC_M)(x)(y) &= PTC_M(x,y)(0) \cdot \prod_{z<y} \sum_{k>0} PTC_M(x,z)(k) \\
&= PT^0_M(x,y) \cdot \prod_{z<y} \sum_{k>0} PTC_M(x,z)(k) = 0.
\end{aligned}$$

This concludes the proof.
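
The cancellation in the leaf case can be checked on a small example. The Python sketch below again assumes a complete binary tree of height 2 (a shape chosen for illustration, in line with the description of Figure 1(a)); `ptc_table`, `mu_ptc` and `cf` are hypothetical names.

```python
from fractions import Fraction

nodes = ["", "0", "1", "00", "01", "10", "11"]   # enumeration order
leaves = {"00", "01", "10", "11"}
pt = lambda y: Fraction(1, 2 ** len(y))

def ptc_table():
    """For each node, the distribution {0: PT0, 1: PT1} of Definition 20."""
    table, survive = [], Fraction(1)
    for y in nodes:
        p0 = pt(y) / survive if y in leaves else Fraction(0)
        table.append({0: p0, 1: 1 - p0})
        survive *= 1 - p0
    return table

def mu_ptc(table, i):
    """Probability that node i is the first one whose distribution yields 0."""
    p = table[i][0]
    for z in range(i):
        p *= table[z][1]
    return p

table = ptc_table()
cf = {y: mu_ptc(table, i) for i, y in enumerate(nodes)}
print(cf)
```

Every leaf receives exactly its probability in the tree and every internal node receives 0, as the two displayed computations predict.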


We are almost ready to wrap up our result, but before proceeding further,
we need to define the function $SP_M : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ that, given in input a pair
(x, y), returns the (encoding of the) string found in the configuration
labelling the node y in $CT_M(x)$. $SP_M$ is of course recursive. We can now prove
the desired result:


Theorem 3 $\mathscr{PC} \subseteq \mathscr{PR}$.


Proof. It suffices to note that, given any PTM M, the function computed by
M is nothing more than $p_{SP_M} \odot (id, CF_M)$. Indeed, one can easily realize that
a way to simulate M consists in generating, from x, all strings corresponding
to the leaves of $CT_M(x)$, each with an appropriate probability. This is
indeed what $CF_M$ does. What remains to be done is simulating $p_{SP_M}$ along
paths leading to final configurations.


We are finally ready to prove the main result of this Section:


Corollary 1 $\mathscr{PR} = \mathscr{PC}$.


Proof. Immediate from Theorem 2 and Theorem 3.

The way we prove Corollary 1 implies that we cannot deduce Kleene’s Normal
Form Theorem from it: minimization has been used many times, some of
them “deep inside” the construction. A way to recover Kleene’s Theorem
consists in replacing minimization with a more powerful operator, essentially
corresponding to computing the fixpoint of a given probabilistic function.


3 Characterizing Probabilistic Complexity by Tiering


In this section we provide a characterization of the probabilistic functions 
which can be computed in polynomial time by an algebra of functions acting 
on word algebras. More precisely, we define a type system inspired by
Leivant’s notion of tiering [13], which rules out functions whose complexity
is too high, thus isolating the class of predicatively
recursive probabilistic functions. Our main result in this section is that the
class YY of probabilistic functions which can be computed by a PTM 
in polynomial time equals the class of predicatively recursive probabilistic 
functions. 

The constructions from Section 2 can be easily generalized to a function
algebra on strings in a given alphabet $\Sigma$, which themselves can be seen as a
word algebra $\mathbb{W}$. Base functions include a function computing the empty
string, called $\varepsilon$, and concatenation with any character $a \in \Sigma$, called $c_a$.
Projections remain of course available, while the only truly random function
is one that concatenates a random symbol from $\Sigma$ to a given string, called
again $r_a$. Composition and primitive recursion are available, although the
latter takes the form of recursion on notation. We do not need minimization:
the distribution a polytime computable probabilistic function returns (on
any input) is always finite, and primitive recursion is powerful enough for
our purposes.

Now we give a formal definition of our functions, starting from their
domain and codomain.


Definition 23 (String Distribution) A (pseudo)distribution on $\mathbb{W}$ is a
function $D : \mathbb{W} \to \mathbb{R}_{[0,1]}$ such that $\sum_{w \in \mathbb{W}} D(w) = 1$. The set $\mathbb{P}_\mathbb{W}$ is defined
as the set of all (pseudo)distributions on $\mathbb{W}$.


The functions in our algebra have domain $\mathbb{W}^k$ and codomain $\mathbb{P}_\mathbb{W}$. The
idea, as usual, is that f(v)(w) = p means that w is the output obtained
from the input v with probability p. Base functions are defined as follows:

$$\varepsilon(v)(w) = \begin{cases} 1 & \text{if } w = \varepsilon; \\ 0 & \text{otherwise;} \end{cases}
\qquad
c_a(v)(w) = \begin{cases} 1 & \text{if } w = a \cdot v; \\ 0 & \text{otherwise.} \end{cases}$$

Note that, for every $v \in \mathbb{W}$, the length of the word obtained after the
application of one of the constructors $c_a$ is |v| + 1 with probability 1.
Projections $\Pi^n_m : \mathbb{W}^n \to \mathbb{P}_\mathbb{W}$ are defined as follows:

$$\Pi^n_m(v)(w) = \begin{cases} 1 & \text{if } w = v_m; \\ 0 & \text{otherwise.} \end{cases}$$

As previously mentioned, the only truly random functions in our algebra are
probabilistic functions in the form $r_a : \mathbb{W} \to \mathbb{P}_\mathbb{W}$, which concatenate a to the
input string (with probability $\frac{1}{2}$), or leave it unchanged (with probability
$\frac{1}{2}$). Formally,

$$r_a(w)(v) = \begin{cases} 1/2 & \text{if } v = a \cdot w; \\ 1/2 & \text{if } v = w; \\ 0 & \text{otherwise.} \end{cases}$$
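
Next we recall the concepts of composition and recurrence introduced in
Definition 4 and Definition 5 and we instantiate them to the case of our
algebra. We first introduce the generalized composition of functions
$f : \mathbb{W}^n \to \mathbb{P}_\mathbb{W}$, $g_1, \ldots, g_n : \mathbb{W}^k \to \mathbb{P}_\mathbb{W}$ as the function
$f \odot (g_1, \ldots, g_n) : \mathbb{W}^k \to \mathbb{P}_\mathbb{W}$ defined as follows:

$$((f \odot (g_1, \ldots, g_n))(v))(w) = \sum_{z_1, \ldots, z_n \in \mathbb{W}} \Big( f(z_1, \ldots, z_n)(w) \cdot \prod_{1 \le i \le n} g_i(v)(z_i) \Big)$$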


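Recurrence over $\mathbb{W}$ takes the following form:

$$\begin{aligned}
f(\varepsilon, v) &= g_\varepsilon(v); \\
f(a \cdot w, v) &= g_a(w, v, f(w, v));
\end{aligned}$$

where $f : \mathbb{W}^{k+1} \to \mathbb{P}_\mathbb{W}$, $g_a : \mathbb{W}^{k+2} \to \mathbb{P}_\mathbb{W}$ for all $a \in \Sigma$, and $g_\varepsilon : \mathbb{W}^k \to \mathbb{P}_\mathbb{W}$.
Analogously to what we have done in Section 2, we write
$f = rec(g_\varepsilon, \{g_a\}_{a \in \Sigma})$ in this case. The following construction is
redundant in presence of primitive recursion, but becomes essential when
predicatively restricting it.

Definition 24 (Case Distinction) If $g_\varepsilon : \mathbb{W}^k \to \mathbb{P}_\mathbb{W}$ and, for every $a \in \Sigma$,
$g_a : \mathbb{W}^{k+1} \to \mathbb{P}_\mathbb{W}$, the function $h : \mathbb{W}^{k+1} \to \mathbb{P}_\mathbb{W}$ such that $h(\varepsilon, v) = g_\varepsilon(v)$
and $h(a \cdot w, v) = g_a(w, v)$ is said to be defined by case distinction from $g_\varepsilon$
and $\{g_a\}_{a \in \Sigma}$ and is denoted as $case(g_\varepsilon, \{g_a\}_{a \in \Sigma})$.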
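
A rough executable reading of these schemes is sketched below in Python, with words as strings and distributions as dictionaries. The helper `bind`, the use of a dictionary to hold one step function per character, and the example function `f` are assumptions made for illustration only.

```python
from fractions import Fraction

ONE, HALF = Fraction(1), Fraction(1, 2)

def c(a):
    """Constructor c_a: prepend the character a with probability 1."""
    return lambda v: {a + v: ONE}

def r(a):
    """Random function r_a: prepend a or leave the word unchanged (1/2 each)."""
    return lambda v: {a + v: HALF, v: HALF}

def bind(dist, k):
    """Feed every outcome of dist to k and add up the weighted results."""
    out = {}
    for y, py in dist.items():
        for z, pz in k(y).items():
            out[z] = out.get(z, Fraction(0)) + py * pz
    return out

def rec(g_eps, g):
    """Recursion on notation: f(eps, v) = g_eps(v), f(a.w, v) = g[a](w, v, f(w, v))."""
    def f(word, v):
        if word == "":
            return g_eps(v)
        a, w = word[0], word[1:]
        return bind(f(w, v), lambda y: g[a](w, v, y))
    return f

def case(g_eps, g):
    """Case distinction: inspect the first character, with no recursive call."""
    return lambda word, v: g_eps(v) if word == "" else g[word[0]](word[1:], v)

# Hypothetical example: each character of the first argument is prepended
# to the running result with probability 1/2.
f = rec(lambda v: {v: ONE}, {"a": lambda w, v, y: r("a")(y),
                             "b": lambda w, v, y: r("b")(y)})
print(f("ab", ""))
```

The recursive call is consumed through `bind` because its result is a distribution, not a single word.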


In the following we will also need the following definition of simultaneous
recursion:


Definition 25 (Simultaneous Recursion) We say that the functions
$f = (f^1, \ldots, f^n)$ are defined by simultaneous primitive recursion over a word
algebra $\mathbb{W}$ from the functions $g^j_\varepsilon : \mathbb{W}^m \to \mathbb{W}$ and $g^j_a : \mathbb{W}^{n+m+1} \to \mathbb{W}$ (where
$j \in \{1, \ldots, n\}$ and $a \in \Sigma$) if the following holds for every j and for every a:

$$\begin{aligned}
f^j(\varepsilon, w) &= g^j_\varepsilon(w); \\
f^j(a \cdot v, w) &= g^j_a(v, w, f^1(v, w), \ldots, f^n(v, w)).
\end{aligned}$$

A function $f^j$ as defined above will be indicated with
$simrec^j(\{g^j_a\}_{j,a}, \{g^j_\varepsilon\}_j)$.


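Example 2 The previous definition allows us to define, for instance, two
functions $f^1$ and $f^2$ over a word algebra with $\Sigma = \{a, b\}$, as follows:

$$\begin{aligned}
f^j(\varepsilon, v) &= g^j_\varepsilon(v) && \forall j \in \{1,2\}; \\
f^j(a \cdot w, v) &= g^j_a(w, v, f^1(w,v), f^2(w,v)) && \forall j \in \{1,2\}; \\
f^j(b \cdot w, v) &= g^j_b(w, v, f^1(w,v), f^2(w,v)) && \forall j \in \{1,2\}.
\end{aligned}$$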


3.1 Tiering as a Typing System 


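Now we define our type system, which will then be used to introduce the
class of predicatively recursive probabilistic functions and therefore
to obtain our complexity result. The type system is inspired by the tiering
approach due to Leivant [13]. The idea behind tiering consists in working
with denumerably many copies of the underlying algebra $\mathbb{W}$, each indexed
by a natural number $n \in \mathbb{N}$ and denoted by $\mathbb{W}_n$. Type judgments take the
form $f \triangleright \mathbb{W}_{n_1} \times \cdots \times \mathbb{W}_{n_k} \to \mathbb{W}_m$, where $f : \mathbb{W}^k \to \mathbb{P}_\mathbb{W}$. In the following,
with slight abuse of notation, $\mathbf{W}$ stands for any expression in the form
$\mathbb{W}_{i_1} \times \cdots \times \mathbb{W}_{i_j}$. Typing rules are given in Figure 6. The idea here is that,
when generating functions by primitive recursion, one goes from a level (tier)
m for the domain to a strictly lower level k for the result. This predicative
constraint ensures that recursion does not cause any complexity explosion.

$$\varepsilon \triangleright \mathbb{W}_k \to \mathbb{W}_k \qquad c_a \triangleright \mathbb{W}_k \to \mathbb{W}_k \qquad r_a \triangleright \mathbb{W}_k \to \mathbb{W}_k \qquad \Pi^n_m \triangleright \mathbb{W}_{s_1} \times \cdots \times \mathbb{W}_{s_n} \to \mathbb{W}_{s_m}$$

$$\frac{\{g_i \triangleright \mathbb{W}_{s_1} \times \cdots \times \mathbb{W}_{s_r} \to \mathbb{W}_{m_i}\}_{1 \le i \le l} \qquad f \triangleright \mathbb{W}_{m_1} \times \cdots \times \mathbb{W}_{m_l} \to \mathbb{W}_k}{f \odot (g_1, \ldots, g_l) \triangleright \mathbb{W}_{s_1} \times \cdots \times \mathbb{W}_{s_r} \to \mathbb{W}_k}$$

$$\frac{g_\varepsilon \triangleright \mathbf{W} \to \mathbb{W}_k \qquad \{g_a \triangleright \mathbb{W}_m \times \mathbf{W} \to \mathbb{W}_k\}_{a \in \Sigma}}{case(g_\varepsilon, \{g_a\}_{a \in \Sigma}) \triangleright \mathbb{W}_m \times \mathbf{W} \to \mathbb{W}_k}$$

$$\frac{g_\varepsilon \triangleright \mathbf{W} \to \mathbb{W}_k \qquad \{g_a \triangleright \mathbb{W}_m \times \mathbf{W} \times \mathbb{W}_k \to \mathbb{W}_k\}_{a \in \Sigma} \qquad m > k}{rec(g_\varepsilon, \{g_a\}_{a \in \Sigma}) \triangleright \mathbb{W}_m \times \mathbf{W} \to \mathbb{W}_k}$$

Figure 6: Tiering as a Typing System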

Those probabilistic functions $f : \mathbb{W}^k \to \mathbb{P}_\mathbb{W}$ such that f can be given a
type through the rules in Figure 6 are said to be predicatively recursive. More 
precisely, the class AJ of all predicatively recursive functions is defined as 
follows. 


Definition 26 The class AZ of predicatively recursive (probabilistic) 
functions is the smallest class of functions that contains the basic functions 
and is closed under the operations of general composition, primitive recursion, 
case distinction (Definition 24) and such that each function can be given a 
type through the rules in Figure 6. 


Next we give the definition of the class of simultaneously predicative recursive 
functions .7.7. 


Definition 27 The class. YZ of simultaneously predicative recursive (prob- 
abilistic) functions is the smallest class of probabilistic functions that contains 
the basic functions and is closed under the operations of general composition, 
simultaneous recursion (Definition 25), case distinction (Definition 24) and 
such that each function can be given a type through the rules in Figure 6, 
plus the rule below: 


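$$\frac{\{g^j_\varepsilon \triangleright \mathbf{W} \to \mathbb{W}_k\}_j \qquad \{g^j_a \triangleright \mathbb{W}_k^n \times \mathbb{W}_m \times \mathbf{W} \to \mathbb{W}_k\}_{j,a} \qquad m > k}{simrec^j(\{g^j_a\}_{j,a}, \{g^j_\varepsilon\}_j) \triangleright \mathbb{W}_m \times \mathbf{W} \to \mathbb{W}_k}$$

3.2 Simultaneous Primitive Recursion and Predicative Recursion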


By closely following Leivant [13], we can show that simultaneous primitive 
recursion can be encoded into predicative recursion. Since the proof of this 
result is precisely the one given by Leivant ([13] Section 4.2), we only sketch 
the main ingredients of it here. 

According to Definition 25, if, e.g., two functions $f^0, f^1$ over a word
algebra with $\Sigma = \{a, b\}$ are defined by simultaneous recursion, then we have
that

$$\begin{aligned}
f^0(\varepsilon, x) &= g^0_\varepsilon(x); \\
f^0(a \cdot w, x) &= g^0_a(w, x, f^0(w,x), f^1(w,x)); \\
f^0(b \cdot w, x) &= g^0_b(w, x, f^0(w,x), f^1(w,x)); \\
f^1(\varepsilon, x) &= g^1_\varepsilon(x); \\
f^1(a \cdot w, x) &= g^1_a(w, x, f^0(w,x), f^1(w,x)); \\
f^1(b \cdot w, x) &= g^1_b(w, x, f^0(w,x), f^1(w,x)).
\end{aligned}$$

The two functions $f^0$ and $f^1$ can indeed be computed by one function $\tilde{f}$
once a pairing operator $\langle \cdot, \cdot \rangle$ is available:

$$\begin{aligned}
\tilde{f}(\varepsilon, x) &= \langle g^0_\varepsilon(x), g^1_\varepsilon(x) \rangle; \\
\tilde{f}(a \cdot w, x) &= \langle g^0_a(w, x, \tilde{f}(w,x)), g^1_a(w, x, \tilde{f}(w,x)) \rangle; \\
\tilde{f}(b \cdot w, x) &= \langle g^0_b(w, x, \tilde{f}(w,x)), g^1_b(w, x, \tilde{f}(w,x)) \rangle.
\end{aligned}$$

The pairing function $\langle \cdot, \cdot \rangle$ is of course primitive recursive, and the same
holds for the corresponding projection functions. But can we give all these
functions a “balanced” type, a type in which the tier of the argument(s)
is the same as the tier of the output? (This is of course necessary if one
wants to encode simultaneous primitive recursion the way suggested by the
equations above.) A positive answer can indeed be given provided pairing
and projections take an additional parameter (of a higher tier) big enough
to “drive” the recursion necessary for computing pairing and projections.
More details can be found in [13].


3.3 Register Machines vs. Turing Machines


Register machines are abstract computational models which, when properly
defined, are Turing powerful. Here we extend the classical definition of a
register machine to the probabilistic case. Again, the way register machines
are defined closely follows Leivant’s proof [13].


Definition 28 (Probabilistic Register Machine) A probabilistic register
machine (PRM) consists of a finite set of registers $\Pi = \{\pi_1, \ldots, \pi_r\}$ and
a sequence of instructions, called a program. Each register $\pi_i$ can store a
string in $\mathbb{W}$, and each instruction in the program is indexed by a natural
number and takes one of the following six forms

$\varepsilon(\pi_d)$;  $c_a(\pi_s)(\pi_d)$;  $r_a(\pi_s)(\pi_d)$;  $p(\pi_s)(\pi_d)$;  $j_\varepsilon(\pi_s)(m)$;  $j_a(\pi_s)(m)$;

where $\pi_s, \pi_d$ are registers and m is an instruction index.


The semantics of the instructions above can be described as follows. We
assume that the index of the current instruction is n.
• The instruction $\varepsilon(\pi_d)$ stores in the register $\pi_d$ the empty string and then
transfers the control to the next instruction.
• The instruction $c_a(\pi_s)(\pi_d)$ stores in the register $\pi_d$ the term $a \cdot w$, where
w is the string contained in the register $\pi_s$. It then transfers the control
to the next instruction.
• The instruction $p(\pi_s)(\pi_d)$ is the predecessor instruction, which stores in
the register $\pi_d$ the string resulting from erasing the leftmost character
from the string contained in $\pi_s$, if any. The control is then transferred to
the next instruction.
• If w is the string contained in $\pi_s$, the instruction $r_a(\pi_s)(\pi_d)$ stores in the
register $\pi_d$ either the string w (with probability $\frac{1}{2}$) or the string $a \cdot w$
(with probability $\frac{1}{2}$).
• The instruction $j_\varepsilon(\pi_s)(m)$ transfers the control to the m-th instruction if
$\pi_s$ contains the empty string, and goes to the next instruction otherwise.
• The instruction $j_a(\pi_s)(m)$ transfers the control to the m-th instruction if
$\pi_s$ contains a string whose leftmost character is a, and goes to the next
instruction otherwise.
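
To make the six instructions concrete, here is a small Python interpreter sketch that propagates exact probabilities over configurations. The tuple encoding of instructions, the 0-based indices, the step bound `max_steps`, the choice of register 0 as the output register, and the assumption that the random instruction also passes control to the next instruction are all conventions chosen for this example.

```python
from fractions import Fraction

def run(program, registers, max_steps=1000):
    """Return the distribution over final contents of register 0.

    Instructions: ("e", d), ("c", a, s, d), ("r", a, s, d), ("p", s, d),
    ("je", s, m), ("j", a, s, m).
    """
    pending = {(tuple(registers), 0): Fraction(1)}   # configuration -> probability
    final = {}
    for _ in range(max_steps):
        if not pending:
            break
        nxt = {}

        def add(regs, pc, p):
            if pc >= len(program):                   # past the last instruction
                final[regs[0]] = final.get(regs[0], Fraction(0)) + p
            else:
                nxt[(regs, pc)] = nxt.get((regs, pc), Fraction(0)) + p

        for (regs, pc), p in pending.items():
            op, *args = program[pc]
            put = lambda d, w: regs[:d] + (w,) + regs[d + 1:]
            if op == "e":
                add(put(args[0], ""), pc + 1, p)
            elif op == "c":
                a, s, d = args
                add(put(d, a + regs[s]), pc + 1, p)
            elif op == "r":                          # fair binary choice
                a, s, d = args
                add(put(d, regs[s]), pc + 1, p / 2)
                add(put(d, a + regs[s]), pc + 1, p / 2)
            elif op == "p":
                s, d = args
                add(put(d, regs[s][1:]), pc + 1, p)
            elif op == "je":
                s, m = args
                add(regs, m if regs[s] == "" else pc + 1, p)
            elif op == "j":
                a, s, m = args
                add(regs, m if regs[s][:1] == a else pc + 1, p)
        pending = nxt
    return final

print(run([("r", "a", 0, 0), ("r", "a", 0, 0)], [""]))
```

Merging equal configurations in a dictionary keeps the state finite for this example; a program that loops forever is simply cut off after `max_steps` rounds, leaving a total mass below 1.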

We can now describe more precisely the semantics of a PRM in terms of
configurations.


Definition 29 (Configuration of a PRM) Let R be a PRM as in Definition 28,
and let $\Sigma$ be the underlying alphabet. We define a PRM configuration
as a tuple $(v_1, \ldots, v_r, n)$ where:

• each $v_i \in \Sigma^*$ is the value of the register $\pi_i$;

• $n \in \mathbb{N}$ is the index of the next instruction to be executed.

We denote the set of all configurations as $CR_R$. If n = 1 we have an initial
configuration for a k-tuple of strings s, which is indicated with $IN_R^s$. If
n = m + 1 (where m is the largest index of an instruction in the program),
we have a final configuration, called $FC_R^s$, where s is the string stored in
$\pi_1$.


First we observe that the meaning of a PRM program R can be defined
by way of two functions $\delta_0$ and $\delta_1$: if the next instruction to be executed
is $r_a$, then $\delta_0(C)$ is potentially different from $\delta_1(C)$, otherwise the two
are equal. In other words, we can consider two functions $\delta_0 : CR_R \to CR_R$ and
$\delta_1 : CR_R \to CR_R$ which, given a configuration in input:
• both produce in output the (unique) configuration resulting from the
application of the next instruction, if different from $r_a$;
• produce the two configurations resulting from the two branches of the
next instruction, if it is $r_a$.
Similarly to what we have done for PTMs (see Section 2), we can define a
(complete) partial order with carrier $CEV_R$ (which is the set of all functions
from $CR_R$ to $\mathbb{P}_{\Sigma^*}$). And hence, we can define a functional $FR_R$ on $CEV_R$
which will be used to define the function computed by R via a fixpoint
construction. Intuitively, the application of the functional $FR_R$ describes
one computation step. More formally:


Definition 30 Given a PRM R, we define a functional $FR_R : CEV_R \to CEV_R$
as:

$$FR_R(f)(C) = \begin{cases} \{s^1\} & \text{if } C \in FC_R^s; \\ \frac{1}{2} f(\delta_0(C)) + \frac{1}{2} f(\delta_1(C)) & \text{otherwise.} \end{cases}$$


Using similar arguments to those in the proofs of Proposition 6 and Theorem
1, we can show that the least fixpoint of $FR_R$ actually exists. Such a
least fixpoint, once composed with a function returning $IN_R^s$ from s,
is the function computed by the register machine R and is denoted by
$IO_R : \Sigma^* \to \mathbb{P}_{\Sigma^*}$. The next lemma shows the relations between PTMs and
PRMs.


Lemma 7 PTMs are linear time reducible to PRMs, and PRMs over $\mathbb{W}$
are polytime reducible to PTMs.


Proof. A single tape PTM M can be simulated by a PRM $R_M$ that has three
registers. A configuration (w,a,v,s) of M can be coded by the configuration
$(w^r,a,v,s)$ where $w^r$ denotes the reverse of the string w. Each move of M is
simulated by at most 2 moves of $R_M$. In order to simulate the probabilistic
part given by the functions $\delta_0$ and $\delta_1$ we use the instructions $\varepsilon$, $r_a$ and j,
plus a dedicated register $\pi_{coin}$, in the natural way. Conversely, a PRM R over
$\mathbb{W}$ with m registers is simulated by a PTM $M_R$ with m tapes. Some moves of
R may require copying the contents of one register to another, for which $M_R$
may need as many steps to complete as the maximum of the current lengths
of the corresponding tapes. Thus if R runs in time $O(n^k)$, then $M_R$ runs in
time $O(n^{2k})$. We can then conclude by remembering that Turing machines
with multiple tapes can be simulated by single-tape Turing machines with a
polynomial slowdown.


3.4 Polytime Soundness 


In this section we prove that any function definable by predicative recurrence
can be computed in polynomial time by a PTM. In view of Lemma 7, in
order to obtain this result it suffices to show that predicative recurrence can
be simulated by a probabilistic register machine working in polynomial time
(dubbed PPRM in the following). This result is not difficult and is proved
below by exhibiting a PPRM which computes any function $f$ such that
$f \triangleright \mathbf{W} \to \mathbb{W}_m$. In the following we denote by
YYZ the class of functions computed by PPRMs.

The length $|v|$ of a string $v$ is simply the number of characters in it.
Given a string distribution $D \in \mathbb{P}_{\Sigma^*}$, its length $|D|$ is
simply the maximal length of strings in the support of $D$. Moreover, if
$\mathbf{v} = (v_1, \ldots, v_n) \in \mathbf{W}$ and
$\mathbf{W} = \mathbb{W}_{m_1} \times \cdots \times \mathbb{W}_{m_n}$, then
$|\mathbf{v}|_l = \max_{m_i = l} |v_i|$. Analogously for $|\mathbf{v}|_{<l}$
and $|\mathbf{v}|_{>l}$. The following is again from [13]:


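Lemma 8 (Max-Plus) If $h \triangleright \mathbf{W} \to \mathbb{W}_m$, then
there is a polynomial $q_h : \mathbb{N} \to \mathbb{N}$ such that for every
$\mathbf{v}$, it holds that
$|h(\mathbf{v})| \le |\mathbf{v}|_m + q_h(|\mathbf{v}|_{>m})$.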


Proof. This is an induction on the structure of a derivation for
$h \triangleright \mathbf{W} \to \mathbb{W}_m$.
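To make the length measures used in the Max-Plus bound concrete, here is a small Python sketch. It assumes that a finite distribution is a dictionary from strings to probabilities and that a tuple of arguments comes with a parallel tuple of tier indices. The function names and the convention that an empty maximum is 0 are assumptions made for the example.

```python
def distribution_length(dist):
    """Maximal length of a string in the support of a finite distribution."""
    return max((len(s) for s, p in dist.items() if p > 0), default=0)

def tier_length(values, tiers, keep):
    """Maximal length among the arguments whose tier satisfies `keep`.

    `values` and `tiers` are parallel tuples; an empty selection gives 0
    (a convention assumed here, not taken from the text).
    """
    return max((len(v) for v, m in zip(values, tiers) if keep(m)), default=0)

# Arguments (v_1, v_2, v_3) living in tiers (0, 1, 1).
values = ("abc", "de", "f")
tiers = (0, 1, 1)

print(tier_length(values, tiers, lambda m: m == 1))  # arguments at tier 1
print(tier_length(values, tiers, lambda m: m > 0))   # arguments above tier 0
print(distribution_length({"a": 0.5, "abcd": 0.5}))
```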


Proposition 11 If $h \triangleright \mathbf{W} \to \mathbb{W}_m$, then there
is a PPRM $R_h$ that computes $h$.


Proof. The proof is by induction on the structure of a derivation for
$h \triangleright \mathbf{W} \to \mathbb{W}_m$:
• We first of all need to show that for every basic function, we can construct
  a PPRM that computes such a function. The proof is immediate for the
  functions $\varepsilon$, $c_a$, and $r_a$, all of which can be easily
  computed by eponymous register machine instructions. The projections can be
  simulated by the instruction $c_a$ followed by $p$.

• Assume that $h$ is defined by composition, namely that
  $h = f \circ (g_1, \ldots, g_n) : \mathbf{W} \to \mathbb{P}_{\mathbb{W}}$.
  We give an intuitive proof by exhibiting a PPRM, called $R_h$, which
  computes $h$ in polynomial time. $R_h$ operates by using
  $R_f, R_{g_1}, \ldots, R_{g_n}$ (all of which exist by induction
  hypothesis) as subroutines in the natural way. The fact that this process
  takes polynomial time is a consequence of the fact that the machines
  $R_f, R_{g_1}, \ldots, R_{g_n}$ are themselves polytime, and that $R_f$ is
  called on inputs of polynomial length, itself a consequence of Lemma 8
  above.

• Assume now that $h$ is defined by case distinction, namely that
  $h = \mathrm{case}(g_\varepsilon, \{g_a\}_{a \in \Sigma})$. In this case
  the PPRM $R_h$ which computes $h$ can be defined as a machine which
  analyzes one of its inputs and decides, based on its value (by way of the
  instructions $j_\varepsilon$ and $j_a$), which one among
  $R_{g_\varepsilon}$ and $R_{g_a}$ (where $a \in \Sigma$) to call on the
  rest of the input. Notice that the machines above exist and work in
  polynomial time by the induction hypothesis.

• Finally, assume that $h$ is defined by primitive recursion, namely that
  $h = \mathrm{rec}(g_\varepsilon, \{g_a\}_{a \in \Sigma})$. In this case
  the PPRM $R_h$ which computes $h$ can be defined as a machine which
  iteratively calls as subroutines the machines $R_{g_\varepsilon}$ and
  $R_{g_a}$ (where $a \in \Sigma$), which exist and work in polynomial time,
  based on the value of one of its inputs. The machines above are clearly
  called a number of times linear in the size of one of the inputs, while
  the fact that each call itself takes polynomial time is a consequence of
  Lemma 8.


This concludes the proof. 


3.5 Polytime Completeness 


There is a relatively easy (although not elegant) way to prove polytime 
completeness of probabilistic ramified recurrence, namely going through the 
same result for deterministic ramified recurrence [13]. The argument goes as 
follows: 

• First of all, it is easy to prove that for every $k$ and for every $n$ the
  function $f_{k,n}$ outputting a sequence of random bits of length
  $|s|^k + n$ (where $s$ is the input) is a ramified probabilistic function.

• Then, one can observe that for every polynomial time computable function
  $g : \Sigma^* \to \Sigma^*$, it holds that
  $p_g \triangleright \mathbb{W}_n \to \mathbb{W}_m$ for some $n$ and $m$, as
  a consequence of Leivant’s result [13].

• Finally, one can observe that any polytime probabilistic function can
  be seen as a deterministic polytime function taking an additional input
  consisting of a “long enough” sequence of random bits.

Polytime completeness is an easy corollary of the three observations above.
More precisely, we now present some lemmas that allow us to prove
completeness.


Lemma 9 (Polytime Random Sequences) For every $k$ and for every
$n$, let $f_{k,n}$ be the probabilistic function outputting a sequence of
random bits of length $|s|^k + n$ (where $s$ is the input). Then
$f_{k,n} \triangleright \mathbb{W}_m \to \mathbb{W}_l$ holds for some natural
numbers $m$ and $l$.


Proof. Let $q$ be a deterministic function on W which outputs a string of
length $|s|^k + n$, where $s$ is the input. Clearly, $q$ is computable in
polynomial time. As a consequence, $p_q$ can be typed in Leivant’s system.
Let randext be the probabilistic function which, on input $s$, outputs either
$0 \cdot s$ or $1 \cdot s$, each with probability $\frac{1}{2}$: randext can
be typed with $\mathbb{W}_m \to \mathbb{W}_m$ for every $m$ (it can be
defined from $r_0$ and $r_1$ and other base functions by case distinction).
What we need to obtain $f_{k,n}$, then, is just to compose a function
obtained by primitive recursion from randext, and $p_q$.
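The construction in this proof can be sketched in Python, assuming finite distributions are represented as dictionaries from strings to exact probabilities. The helper names `randext`, `bind`, and `random_bits` are chosen for the example. The loop stands in for the primitive recursion driven by the output of $q$, and the sketch ignores tiers entirely.

```python
from fractions import Fraction

def randext(s):
    """Prepend a fair random bit to s."""
    return {"0" + s: Fraction(1, 2), "1" + s: Fraction(1, 2)}

def bind(dist, g):
    """Apply the probabilistic function g to a string drawn from dist."""
    out = {}
    for s, p in dist.items():
        for t, q in g(s).items():
            out[t] = out.get(t, Fraction(0)) + p * q
    return out

def random_bits(s, k, n):
    """Distribution of a random bit string of length |s|^k + n."""
    length = len(s) ** k + n
    dist = {"": Fraction(1)}        # start from the empty string
    for _ in range(length):         # one call to randext per output bit
        dist = bind(dist, randext)
    return dist

d = random_bits("ab", 1, 1)         # |s|^k + n = 2 + 1 = 3
print(len(d), set(d.values()))
```

The number of iterations is polynomial in the length of the input, which is the reason a deterministic polytime function producing a string of that length is enough to drive the recursion.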


The next Lemma is again due to Leivant [13]. 


Lemma 10 (Polytime Functions and Predicative Functions) For every
deterministic polynomial time computable function
$g : \Sigma^* \to \Sigma^*$ it holds that
$p_g \triangleright \mathbb{W}_n \to \mathbb{W}_m$ for some $n$ and $m$.


Then we have the following. 


Theorem 4 PYABC AT. 


Proof. Consider any probabilistic polytime Turing machine $M$. From the
discussion at the beginning of this section, it is clear that the
probabilistic function computed by $M$ is
$p_f \circ p_{pair} \circ (id, f_{k,n})$, where $f$ is a polytime computable
deterministic function, pair is a deterministic function encoding two strings
into one, and $f_{k,n}$ is the function from Lemma 9. Since the three
functions can be given a type, so can their composition.
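The decomposition used in this proof can be illustrated by a short Python sketch: a probabilistic function is recovered by pushing the uniform distribution over random strings of length $|s|^k + n$ through a deterministic function that reads the input together with the random string. The names `uniform_bits`, `lift`, and `toy_f` are hypothetical, and a plain pair of arguments stands in for the pairing function.

```python
from fractions import Fraction
from itertools import product

def uniform_bits(length):
    """Uniform distribution over all bit strings of the given length."""
    p = Fraction(1, 2 ** length)
    return {"".join(bits): p for bits in product("01", repeat=length)}

def lift(f, s, k, n):
    """Distribution of f(s, r) for r uniform of length |s|^k + n."""
    out = {}
    for r, p in uniform_bits(len(s) ** k + n).items():
        y = f(s, r)                  # deterministic once r is fixed
        out[y] = out.get(y, Fraction(0)) + p
    return out

def toy_f(s, r):
    """Toy deterministic function: reverse s when the first random bit is 1.

    Assumes r contains at least one bit.
    """
    return s[::-1] if r[0] == "1" else s

print(lift(toy_f, "ab", 1, 0))
```

Enumerating all random strings takes exponential time and serves only to compute the output distribution exactly; a single run of the machine reads one such string.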


We are finally ready to prove the main result of this section: 
Corollary 2 PYPR= PTZ. 


Proof. Immediate from Theorem 4 and Proposition 11. 


A more direct way to prove polytime completeness consists in showing how 
single-tape PTMs can be encoded into predicative recurrence. This can be 
done relatively easily by exploiting simultaneous recursion, but we leave this 
for future work. 


4 Conclusions 


In this paper, we make a first step in the direction of characterizing
probabilistic computation in itself, from a recursion-theoretical perspective,
without
reducing it to deterministic computation. The significance of this study 
is genuinely foundational: working with probabilistic functions allows us 
to better understand the nature of probabilistic computation, but also to 
study the implicit complexity of a generalization of Leivant’s predicative 
recurrence, all in a unified framework. 

More specifically, we give a characterization of computable probabilistic 
functions by a natural generalization of Kleene’s partial recursive functions. 
We then prove the equi-expressivity of the obtained algebra and the class of 
functions computed by PTMs. In the second part of the paper, we investigate 
the relations existing between our recursion-theoretical framework and
subrecursive classes, in the spirit of ICC. More precisely, endowing
predicative
recurrence with a random base function is proved to lead to a characterization 
of polynomial-time computable probabilistic functions. 

An interesting direction for future work could be the extension of our 
recursion-theoretic framework to quantum computation. In this case one 
should consider transformations on Hilbert spaces as the basic elements 
of the computation domain. The main difficulty in obtaining a completeness
result for the resulting algebra and proving the equivalence with
quantum Turing machines seems to be the definition of suitable recursion
and minimization operators, given that qubits (the quantum analogues of
classical bits) can be neither copied nor erased.